IPinfo on Google Cloud Marketplace

This document explores how to use our IP database within the Google Cloud Platform (GCP)'s data warehouse platform, BigQuery, delivered through Google Cloud Marketplace.

GCP Marketplace to BQ

IPinfo BigQuery Datasets available on Google Cloud Marketplace

Compared to our IP data being pushed to GCP’s Google Cloud Storage (GCS), data delivery via the Google Cloud Marketplace provides enhanced benefits, including:

  • Data is automatically ingested into the BigQuery platform. The organization/user does not have to write custom ingestion scripts.
  • IPinfo BigQuery datasets come with metadata information that helps users better monitor the data updates, freshness and health.
  • IPinfo BigQuery datasets also come prepackaged with IP lookup queries and solid documentation, making using our datasets a breeze.
  • BigQuery, as the data warehouse, acts as the central repository of all data. With the readily available IPinfo BigQuery dataset, IP enrichment can be done on IP data across different datasets, various storage buckets, cloud logging data, etc.

Features

IPinfo BigQuery Datasets

Currently, customers have the following options to access IPinfo data in their BigQuery environments:

  1. Visit the GCP Data Marketplace and leverage IPinfo’s free country + asn dataset via self serve consumption.
  2. Make a request in the GCP Data Marketplace if you are interested in exploring IPinfo’s broader portfolio of contextualized IP address datasets, delivered into BigQuery.

IP Lookup mechanism

IP address datasets are specialized and involve specific solutions for looking up IP metadata information via IP addresses. We highly recommend that users follow our instructions regarding IP lookup mechanisms instead of creating DIY solutions. We have researched to find the most optimal and performant solution for IP lookups.

Our UDF lookup mechanism supports both IPv4 and IPv6 addresses simultaneously.

Advanced User Defined Functions

Through requests, we can provide Advanced User Defined Functions that are extremely performant. Please note that these UDFs require preprocessing the dataset and writing some boilerplate code. However, they are extremely performant and can provide MMDB-like query speed. If you are interested in using the Advanced User Defined Functions with our free IP database, please let us know.

Bringing log data from Cloud Logging

BQ Log Analysis

As the IPinfo data resides in BigQuery, you can bring in log data from GCP Cloud Logging and perform IP metadata enrichment (geolocation, privacy, ASN, etc.) inside BigQuery. We recommend checking the documentation provided by Google, visit View logs routed to BigQuery.

We recommend ensuring that only distinct IP addresses are enriched instead of all the IP addresses in a log.

SELECT ip, ipinfo_free.country_asn(ip).*
FROM (SELECT DISTINCT ip FROM my_ips)

As Cloud logging supports many different sources, IPinfo's data inside BQ acts as a powerful data enrichment and analytics platform.

Our free IP database on Google Marketplace

IP to Country ASN DB in GCP Marketplace

IPinfo's free IP to Country ASN database is available for free through Google Marketplace. You can bring reliable IP data into your BQ environment within seconds. To learn more about the free IP to Country ASN database itself, see its documentation.

The IPinfo IP to Country ASN can be acquired in two ways:

Resources included

The IPinfo free IP to Country ASN BQ dataset contains the following resources:

  • ipinfo__free_ip_to_country_and_asn_database: The dataset (We recommend changing the dataset name to ipinfo_free)
    • Routines: Contains UDFs and helper functions
      • country_asn: UDF for the IP lookup query
    • country_asn: Core IP to Country ASN database
    • log_data: 1000 sample IP addresses with 500 IPv4 and 500 IPv6 addresses.

Using the country_asn UDF

IP to Country ASN BQ UDF demo

We package standard IP lookup UDFs with our BigQuery datasets. The UDF is available in the dataset's routine directory with the same name as the IP data table. Using the UDF requires no custom coding or writing your functions.

SELECT ip, ipinfo_free.country_asn(ip).*
FROM ipinfo_free.log_data

Code breakdown:

  • The log_data table contains IP addresses from traffic logs. The ip columns hold all the IP addresses.
  • The country_asn is a UDF packaged with the IP to Location IPinfo BigQuery dataset. Note that both the dataset and the UDF usually have the same name. However, the brackets note the UDF as a function ().
  • .* can return all the columns from the IP data table or even specific columns. For example, for city names, country_asn(ip).city, for ASN, country_asn(ip).asn.

Query performance

-- 1M IPv4
SELECT ip, `ipinfo-public`.ipinfo_free.ip_country_asn(ip).*
FROM (SELECT * FROM log WHERE ip NOT LIKE '%:%' LIMIT 1000000);
-- Elapsed: 9s
-- Slot time: 4m4s
-- Shuffled: 3.7GB

-- 1M IPv6
SELECT ip, `ipinfo-public`.ipinfo_free.ip_country_asn(ip).*
FROM (SELECT * FROM log WHERE ip LIKE '%:%' LIMIT 1000000);
-- Elapsed: 20s
-- Slot time: 5m42s
-- Shuffled: 3.83MB

-- 10M IPv6
SELECT ip, `ipinfo-public`.ipinfo_free.ip_country_asn(ip).*
FROM (SELECT * FROM log WHERE ip LIKE '%:%' LIMIT 10000000);
-- Elapsed: 5m30s
-- Slot time: 2h25m
-- Shuffled: 12.2GB

-- 100M IPv6
SELECT ip, `ipinfo-public`.ipinfo_free.ip_country_asn(ip).*
FROM (SELECT * FROM log WHERE ip LIKE '%:%' LIMIT 100000000);
-- Elapsed: 45m
-- Slot time: 1d18h
-- Shuffled: 91GB

Please note that our Advanced User Defined Functions are far more performant than our standard UDFs. They can be delivered upon request.


As this is a new development, we look forward to your feedback. If you have any feature requests or need support using the BQ IPinfo dataset, please create a post in our IPinfo Community.