Modern society produces data at a speed, volume, and complexity that were unimaginable even a few decades ago. And there’s no sign of a slowdown on the horizon: in coming years, data production will continue to accelerate at a rate that will soon make contemporary concepts of “big data” seem quaint.
Consider just the built-in sensors on our smartphones. Smartphones come equipped with microphones, GPS, gyroscopes, cameras, magnetometers, accelerometers, pedometers, fingerprint readers, thermometers … the list goes on. Now think about the number of people carrying smartphones and the number of hours we spend using them. Those sensors are gathering data all the time; that data is being collected and stored. And that’s just the sensors—forget about the browsers and other apps collecting data—on just one type of device.
The International Data Corporation (IDC), a market intelligence firm, predicts that the world’s data will total 175 zettabytes by 2025. How much data is that? Here’s one way to envision it: the largest-capacity iPhone 11 can hold 256 GB of data and is 0.33 inches thick. Download 175 zettabytes onto iPhones, stack them up, and you’ll have about 15 piles of iPhones that each reach the moon. Lay them end-to-end and you’d get roughly halfway to Mars.
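For readers who want to check the iPhone comparison, here is a rough back-of-the-envelope sketch in Python. Only the 175 zettabytes, 256 GB, and 0.33-inch figures come from the article. The rest are assumptions made for illustration: decimal units, a mean Earth-moon distance of about 384,400 km, an iPhone 11 length of 5.94 inches, and an average Earth-Mars distance of roughly 225 million km (the real distance varies widely as the planets orbit).

```python
# Back-of-the-envelope check of the iPhone comparison.
# From the article: 175 ZB of data, 256 GB per phone, 0.33 in thick.
# Assumed for illustration: decimal units, phone length, and both
# astronomical distances (the Earth-Mars distance varies a great deal).

ZETTABYTE = 10**21        # bytes (decimal definition)
GIGABYTE = 10**9          # bytes (decimal definition)
KM_PER_INCH = 2.54e-5     # 1 inch = 2.54 cm

TOTAL_DATA_BYTES = 175 * ZETTABYTE
IPHONE_CAPACITY_BYTES = 256 * GIGABYTE
IPHONE_THICKNESS_IN = 0.33
IPHONE_LENGTH_IN = 5.94          # assumed
EARTH_MOON_KM = 384_400          # assumed mean distance
EARTH_MARS_KM = 225_000_000      # assumed average distance


def iphones_needed():
    """Number of 256 GB phones required to hold 175 ZB."""
    return TOTAL_DATA_BYTES / IPHONE_CAPACITY_BYTES


def piles_to_moon():
    """How many moon-height stacks the phones would form."""
    stack_km = iphones_needed() * IPHONE_THICKNESS_IN * KM_PER_INCH
    return stack_km / EARTH_MOON_KM


def fraction_of_way_to_mars():
    """Fraction of the assumed Earth-Mars distance covered end-to-end."""
    line_km = iphones_needed() * IPHONE_LENGTH_IN * KM_PER_INCH
    return line_km / EARTH_MARS_KM


if __name__ == "__main__":
    print(f"Phones needed: {iphones_needed():.3e}")
    print(f"Stacks reaching the moon: {piles_to_moon():.1f}")
    print(f"Share of the way to Mars: {fraction_of_way_to_mars():.0%}")
```

Under these assumptions the numbers come out to just under 15 moon-height stacks and a bit less than half the average distance to Mars, which is in line with the comparison above.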
Traditional single-machine computing and basic statistical analysis are no longer able to keep up with what we’ve dubbed, unimaginatively, big data. There are as many uses of big data as you can imagine, and “big data” versions of just about every computing career: big data engineer, big data analyst, big data architect, and—of course—big data developer.
So, what is a big data developer (BDD)? In this article, we’ll cover:
Big data developers work on the software side of large-scale data processing systems that businesses and research organizations use to turn vast quantities of data into useful information.
The lines between different information technology jobs are blurry, and many of the responsibilities of a big data developer overlap with “big data engineer” positions. These jobs are often posted in the context of a specific framework, e.g., “Hadoop developer.” As a rule of thumb, developers focus primarily on coding and have less administrative responsibility than do engineers or architects.
Almost all of the big tech companies—Google, Microsoft, Amazon, Apple, Facebook, etc.—need lots of big data developers and other data specialists. Big data developers also work as part of research teams that use big data to advance epidemiology and precision medicine, meteorology and climate science, and other fields. There are even roles in political science and government for big data developers who want to work on questions of public policy. In short, wherever there are large amounts of data, there is a need for big data developers.
In the simplest terms: big data developers write code. The core of the job is programming. You’ll spend a lot of time at a computer creating and troubleshooting code.
That said, you’ll face a variety of tasks and challenges. You’ll work in different programming languages and with various software, depending on the needs of a given project and its stage of completion. Early on, you’ll be planning solutions and writing a lot of new code; in the later stages, you’ll be focused on automating the entire data collection and analysis system so that it can take care of itself with minimal babysitting.
Throughout, the work of writing code will be interspersed with meetings with other team members, answering questions, and troubleshooting the programs you’ve written.
The essential skills you need to become a big data developer rest on the same foundation of software development skills as other jobs in the field. Before we get into the unique requirements of BD development, let’s review the basics:
Much of this specialization comes down to learning the unique software tools that BDDs use to develop and implement their ideas. The most important of these is Apache Hadoop, a suite of open-source programs and procedures that supplies all of the basic tools needed to perform big data analysis. It’s the most common system for data storage and processing using off-the-shelf hardware, to the point where “Hadoop developer” is nearly synonymous with “big data developer.” Though it’s not as ubiquitous (yet), Apache Spark is a powerful data analytics engine that integrates with Hadoop and seems poised to replace MapReduce, the data analytics module currently built into Hadoop.
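To give a feel for the kind of code involved, here is a minimal single-machine sketch of the MapReduce programming model that Hadoop's built-in analytics module is based on. It is plain Python and does not use the Hadoop or Spark APIs; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster, the framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: collapse each word's list of counts into a total."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    sample = ["big data is big", "data about data"]
    print(word_count(sample))
```

The point of splitting the work this way is that each map call depends only on its own document and each reduce call only on one key, so the same logic can be spread across many machines without changing its result.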
Aspiring BDDs will also want to learn:
Anyone can become a big data developer with enough training. Still, a few qualities can help make the job a better fit: an affinity for logic, an eye for detail, and proficiency at learning languages will all help with the programming aspects of the job. A love of learning will also serve you well because a significant element of a big data developer’s job is keeping up with the latest trends in big data and software development. And, of course, a positive attitude and friendly disposition are an asset to anybody working as part of a team.
The initial path to becoming a big data developer overlaps substantially with other software-focused careers. Although it’s possible to learn many of the core skills of software development via free online courses and educational apps, the vast majority of developers start with a bachelor’s degree in computer science or information technology. It’s also possible to approach a career in big data from a mathematics background, particularly with a bachelor’s in statistics, which offers a strong grounding in the mathematical concepts that underpin big data development.
Once you’ve obtained a relevant undergraduate degree, many programs specializing in big data become available. These include certificate programs, master’s degrees, and doctorates. Generally speaking, pursuing a doctorate would be overkill if you’re just trying to break into big data development. We’ll take a closer look at certifications and master’s programs below.
You don’t need any certifications or accreditations to become a big data developer, but some programs may be useful in sharpening your skills or bolstering your résumé. These programs have the advantage of being less expensive and less time-consuming than full master’s or doctorate programs. Note that many of these programs are designed with aspiring managers in mind, so be sure the one you select is meant for developers, and not just for tech-industry leadership professionals who want to brush up on the latest in development.
There is no centralized professional association that recognizes or certifies big data developers, and as a result there is no standard list of certifications for this profession. Probably the biggest vendor in big data certifications and exams is Cloudera, which merged with Hortonworks (previously its biggest competitor) in 2019. Other vendors offer certifications and boot camps, as do prestigious (and not-so-prestigious) universities. Poke around job descriptions online and ask recruiters at job fairs what certifications they prefer to narrow your choices.
There are no dedicated graduate programs for big data developers yet, and a master’s degree isn’t needed to enter the field. That doesn’t mean, however, that these programs have no value. The best options can be sorted into two categories: software-oriented and business-oriented.
Lucky for you, the web is loaded with resources for anyone interested in learning how to work with big data. In fact, there may be too many—it can be overwhelming trying to find good resources. Here are a few recommendations:
You can also explore this detailed list of big data resources from Quora or this list of free online courses on big data and Hadoop from Medium’s The Startup magazine.
We’ve walked through the first steps to become a big data developer: get a relevant undergraduate degree, learn to code, get a strong foundation in statistics and data science. But how does one actually enter the field?
Wherever you attend college, avail yourself of the career services office. This too-often-ignored campus resource should be able to offer you some help in finding employment. Furthermore, odds are good that there will be ways to accrue work experience before completing your degree, whether through internship programs, summer employment, or specialized course arrangements. Seek these out. Employers value experience highly, in many cases more so than degrees and certifications.
Even though big data developers are a relatively specialized type of software developer, there are entry-level jobs available. If you’re having trouble breaking in, however, there are many entry-level developer jobs in other specialties (e.g., back-end web development) that can serve as a pathway into the broader field. Many of the core skills of software development are transferable across specialties.
If you opt for a traditional undergraduate program, it will take four years on average. However, if you’re teaching yourself big data development, it’s possible to do so in as little as a year or two, depending on how much time you can devote to your studies. Employers in the big data field tend to be more interested in a strong portfolio of work and demonstrated “real-world” skills than in specific degrees, so if you’re a particularly dedicated autodidact, it’s not impossible to skip the degree entirely.
So far we’ve managed to avoid talking about the one thing that may have put this career on your radar in the first place: big data jobs are generating a lot of buzz right now. Big data specialists are in high demand. An Indeed.com search for salaries related to big data returned a range of roughly $70,000 to $140,000 per year, which is consistent with the U.S. Bureau of Labor Statistics’ report that software developers earned a median income of approximately $105,000 per year as of 2018.
For aspiring IT professionals—or individuals looking to change specialties—those are appealing numbers. Hopefully, some of the resources we’ve provided in this article will help you decide what path is best for you. You can get hands-on experience with many of the tools of software development from the comfort of your living room. If you find yourself interested in a programming career, you could do worse than consider big data development.