It's difficult to overstate how many new features Google BigQuery has to offer.

When you first launch the service, you are presented with a detailed summary of your data. You can see what type of dataset you have, how large it is, and for what period the data is stored.

You can even tell Google BigQuery to give you a demo of your dataset right away! I find that a bit intimidating, but it's also useful. It gives you a real idea of the data in your dataset, so you can think about how to organize it and how to make sense of it.

You can drill down to more specific datasets by clicking the link in the upper right corner, which opens detailed information about the dataset.

Google BigQuery is a speedy, extremely cost-efficient way to store and query terabytes or more of data. As the screenshot shows, I'm storing my data on Google Cloud Platform.

[Screenshot: Google_BigQuery_1]

Google BigQuery also offers a way of looking at large datasets called "Query Performance Analysis."

The acronym QPA isn't very exciting, but don't let that fool you. It's a very cool tool that lets you see how fast your queries are executing against your data. When a long query is running over a large dataset, you can watch how quickly it is progressing in Google BigQuery.

Google BigQuery also offers a way to visualize the latency and throughput of your queries. You can use the streaming portal to see your queries as they run, or you can click the snapshot button and let Google show you the response time of each query.
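If you prefer numbers to charts, latency and throughput are easy to work out yourself. Here is a minimal sketch, assuming you already have a query's start time, end time, and bytes processed (in the google-cloud-bigquery Python client, a finished query job exposes these as `started`, `ended`, and `total_bytes_processed`). The helper name `query_stats` and the sample values are made up for illustration.

```python
from datetime import datetime, timezone


def query_stats(started, ended, bytes_processed):
    """Return (elapsed_seconds, megabytes_per_second) for one finished query."""
    elapsed = (ended - started).total_seconds()
    if elapsed <= 0:
        raise ValueError("ended must be later than started")
    # Throughput in decimal megabytes (1 MB = 1,000,000 bytes) per second.
    mb_per_s = bytes_processed / 1_000_000 / elapsed
    return elapsed, mb_per_s


# Illustrative values only, not measurements from a real job.
started = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
ended = datetime(2024, 1, 1, 12, 0, 4, tzinfo=timezone.utc)
elapsed, rate = query_stats(started, ended, 2_000_000_000)
print(f"{elapsed:.1f} s, {rate:.0f} MB/s")
```

Comparing these two numbers across runs is a quick way to tell whether a query got slower because it scanned more data or because it simply took longer.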

How Do You Get Started Using Google BigQuery?

To use BigQuery, you'll need a Google Cloud Platform (GCP) account, the email address tied to it, and a secret key (for example, a service account key) if you want programmatic access. That's what I've set up so far, so if you don't already have a GCP account, sign up.

Next, click on the "Get Started" button and follow the wizard on the screen.

For downloading a big data dump, Google provides you with a website from which you can download an up-to-date spreadsheet. Download this file and place it somewhere you can find it easily.

Next, open the Google BigQuery console.

Creating a Dataset

The first thing you need to do is create a dataset and then connect to it.

You create datasets in the cloud: start a BigQuery session, go to your project, and create a new dataset. You can then connect to the newly created dataset right away; BigQuery is a managed service, so there is no server for you to start. Note that the data is stored in Google Cloud, not locally on your machine.
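Creating a dataset can also be done from code rather than the console. Below is a minimal sketch, assuming the third-party google-cloud-bigquery package is installed and credentials are already configured. The project and dataset names are placeholders, the helper names `dataset_id` and `create_dataset` are my own, and the naming rule in the regular expression (letters, digits, and underscores, up to 1,024 characters) is an assumption you should check against the current documentation.

```python
import re

# Assumption: dataset names use only letters, digits, and underscores,
# with a maximum length of 1,024 characters.
_DATASET_NAME = re.compile(r"[A-Za-z0-9_]{1,1024}")


def dataset_id(project_id, dataset_name):
    """Build the fully qualified "project.dataset" ID after a basic name check."""
    if not _DATASET_NAME.fullmatch(dataset_name):
        raise ValueError(f"invalid dataset name: {dataset_name!r}")
    return f"{project_id}.{dataset_name}"


def create_dataset(project_id, dataset_name, location="US"):
    """Create the dataset; requires google-cloud-bigquery and valid credentials."""
    # Imported here so the rest of the sketch runs without the library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    dataset = bigquery.Dataset(dataset_id(project_id, dataset_name))
    dataset.location = location
    # exists_ok=True makes the call safe to repeat.
    return client.create_dataset(dataset, exists_ok=True)


# Placeholder names; only the ID is built here, nothing is sent to Google.
print(dataset_id("my-project", "demo_dataset"))
```

Calling `create_dataset("my-project", "demo_dataset")` would perform the actual creation once a real project ID and credentials are in place.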

Getting a Big Data Dump

Once you've connected to the BigQuery server, it's time to request a big data dump.

We'll focus on two features that you'll find useful later. First, you can customize the schedule, which means you can set the dump to be downloaded at a specific date and time. Second, you can cancel a pending dump by selecting "Cancel BigQuery Archive."

Let's do that!

Click on the Get Data tab at the top and press the Get Data button.

The first option (Get Data) lets you download a whole BigQuery dataset (more on this below).

The second option (Get Data Package) gives you a zip file containing the compressed dataset. Just choose it and press OK.

In a few seconds, the zip file will be downloaded to your machine.
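Before loading anything, it's worth checking what the zip file actually contains. Here is a minimal sketch using Python's standard `zipfile` module. The file name `rows.csv` and its contents are invented so the example runs without a real download; with a real package you would read the bytes from the downloaded file instead.

```python
import io
import zipfile


def list_zip_members(zip_bytes):
    """Return (name, uncompressed_size) for every file in a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Build a tiny archive in memory to stand in for the downloaded package.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("rows.csv", "id,name\n1,alice\n2,bob\n")

members = list_zip_members(buffer.getvalue())
print(members)
```

For a file on disk, `zipfile.ZipFile(path)` accepts the path directly, and `extractall()` unpacks everything into a folder of your choice.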

How to View and Load a Dataset

To load a dataset, you just need to choose it from the list and press the Load button.

Next, you will see a window with the data in question. Here you can customize some of the options.

The first option lets you download only selected rows. Select "View only selected rows" and press OK.

Select "Load only selected rows" and press OK.

Now we'll download the filtered data (more on that below).

Select "Filter selected rows" and press OK.

The next option (View only selected rows) allows you to download only selected rows. Select it and press OK.

The final option (Download only selected rows) is very useful for loading preprocessed data. Just select it and press OK.

In the Download only selected rows box, select the "All rows" option and press OK.

Using the Preprocessed Data

Now we need to convert the compressed data to the type that BigQuery understands: JSON.

BigQuery accepts many types of data, and the JSON format is one of them.
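One detail to keep in mind: when BigQuery loads JSON files, it expects newline-delimited JSON, meaning one complete object per line rather than a single large array. Here is a minimal sketch of that conversion using only the standard library. The helper name `to_ndjson` and the sample rows are made up for illustration.

```python
import json


def to_ndjson(rows):
    """Serialize a list of dict rows as newline-delimited JSON, one object per line."""
    return "".join(json.dumps(row, separators=(",", ":")) + "\n" for row in rows)


# Sample rows standing in for the preprocessed data.
rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]
ndjson = to_ndjson(rows)
print(ndjson, end="")
```

Writing `ndjson` to a file gives you something a JSON load job can read line by line, and each line can be parsed back with `json.loads` to verify it.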

Just select the "Json" option from the Download selected rows box and press OK.

Just click the red button in the bottom-left corner to connect to the remote server and execute the code. You can then verify the result by running:

bigquery.github.com/advnat/index.html | JSON_ARRAY: '{"aggregation name":"count", "collection name":"aggregator", "aggregationType":"json"}'

If it works, you should see something like this:

[Screenshot: Google_BigQuery_2]

If you're not sure, you can check the JSON example file on my GitHub page.

Closing Notes

All in all, my experience with BigQuery was relatively smooth and pleasant. BigQuery is quite easy to use and very flexible when downloading, sharing, and processing large datasets.

This project was written as a springboard for a deeper understanding of BigQuery. If you have any questions, leave a comment, and I'll try my best to respond. If you have suggestions, let me know.

Simplilearn offers various courses and programs in Big Data and cloud computing. If you are focused on the Google Cloud Platform, you may want to consider the Google Cloud Platform Architect Certification Training course. If you are interested in cross-platform cloud computing, look into the Post Graduate Program In Cloud Computing in collaboration with Caltech CTME. Or, if your professional interests lie in Big Data and data engineering, you might want to pursue the Post Graduate Program in Data Engineering in partnership with Purdue University.

About the Author

Matthew David

Matt is a Digital Leader at Accenture. His passion is solving today's problems so organizations run more efficiently, using digital tools to improve tomorrow, and moving organizations to new ways of working that shape the future.
