The Alternative Data Industry

What is alternative data? 📊

In a world where everyone is looking for an edge, industry leaders are increasingly looking for data that is not readily available to others. That's where alternative data comes in. First, let's start with what we would consider traditional data sources. Traditional data sources come from within a company, such as financial statements, SEC filings, press releases, and earnings calls. Alternative data refers to data used to evaluate a company that comes from outside those traditional data sets. We'll talk more about the different variations of alt data later. Before we do that, let's get into who uses it.

Who uses alternative data? 👨‍💻👩‍💻

Many different people and organizations use alternative data, but let's stick to the most prominent sectors (so far).

Hedge Funds & Mutual Funds

Buy-side firms such as hedge funds and mutual funds LOVE alternative data, and they are its biggest consumers. These firms are constantly searching for an edge in the market. Alternative data gives firms like these a massive advantage when making an investment decision, such as taking a short position, selecting a long play, or buying or selling a stock before the company's earnings call.

Private Equity

From what I've gathered, alternative data is still relatively new to private equity. However, it can give the private equity industry significant insight into rising industries to help guide future investments, or possibly even help discover a new one. For example, I am currently looking at the analytics from the App Store, and I can see that the top six downloaded apps are Coinbase, TikTok, YouTube, Facebook, Instagram, and Robinhood. I can also see that these apps cover two categories: social media and finance. So, in theory, if I were in private equity, I would want to start seriously looking into companies that cover both of those categories (which would be Public).

Business Valuations

Alternative data can add a lot of value to business valuations in the private market by providing more context. Private company data is not as available or as reliable as public company data, so alternative data can deliver the transparency needed for a proper business valuation.

Credit Underwritings 

Equifax, which owns alternative data supplier DataX, estimates that around 10% of adult Americans lack enough credit history for traditional credit scoring to assess their risk. By turning to other sources, firms can evaluate a consumer's payment history for everyday bills, such as cable television, utilities, and cellphone service. As a result, alternative data has enabled banks and insurers to tap into a previously "invisible" market of millions of people.

Insurance 

Alternative data also improves life, health, car, and home insurance. In this case, the insured produce their own data and provide it to their insurance company in exchange for a discount. Because of their almost unlimited access to alternative data sets through subscriptions and hardware, Apple and Amazon could be a significant force in the insurance industry if they ever choose to jump in (which both have been rumored to do for some time). Over a billion people use Apple products worldwide, and around 150M people are subscribed to Amazon Prime.

How is alt data manufactured? 👨‍👩‍👧‍👦📱🛰

Three primary sources create alternative data: individuals, businesses, and tech (specifically sensors).

Individuals

We, as individuals, are constantly contributing to alternative data sets. Some examples of this are:

  • Google searches

  • product purchases 

  • social media engagement

  • product reviews 

  • web traffic 

  • app usage

  • and much more 

Alternative data produced by individuals can be critical in understanding human behavior towards products, trends, and industries. But more importantly, it can play a role in forecasting human behavior by comparing large datasets against similar cases from the past.

Businesses

Businesses also contribute to alternative datasets. A lot of you may be thinking, "JT, you're a liar. You said data that comes from businesses is traditional data! You stink!" To which I would respond: please settle down. Yes, you are correct; data that comes from within a business is traditional. However, the business datasets I'm talking about are generated by the business, but the business does not gather or deliver them; outside sources do. Some examples of these are:

  • Banking Records

  • Supply Chain Data

  • Variations of Corporate Data

Alternative data produced by businesses can be crucial to understanding historical trends. 

Tech/Sensors

Tech/sensor data also plays a massive part in creating alternative data sets. Devices enabled by tech are constantly sending signals to each other. Capturing these signals produces what we refer to as sensor data. Some examples of these are:

  • Geolocation Data

  • Weather Forecasts

  • Satellite Images

  • Shipping Data

  • and much more

Sensor/tech data can help with creating more predictive models. It can help you connect dots that may not entirely connect yet but look to be in line, so to speak. For example, suppose the weather forecast predicts a colder than usual winter. In that case, you can reasonably expect that more households will be cranking up their thermostats, resulting in a spike in heating oil usage.
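To make the weather example a bit more concrete, here is a minimal sketch using heating degree days (HDD), a common way to turn temperatures into a rough heating-demand signal. The 65°F base is the usual convention; the daily temperatures are made-up numbers for illustration, not real forecast data.

```python
BASE_TEMP_F = 65.0  # conventional base temperature for heating degree days


def heating_degree_days(daily_mean_temps_f, base=BASE_TEMP_F):
    """Sum how far each day's mean temperature falls below the base."""
    return sum(max(0.0, base - t) for t in daily_mean_temps_f)


# Hypothetical daily mean temperatures (°F) for one week
normal_week = [40, 42, 38, 45, 41, 39, 43]
cold_forecast_week = [30, 28, 33, 35, 29, 31, 27]

normal_hdd = heating_degree_days(normal_week)
cold_hdd = heating_degree_days(cold_forecast_week)

# Relative increase in the heating-demand signal for the colder week
increase_pct = (cold_hdd - normal_hdd) / normal_hdd * 100

print(f"Normal week HDD: {normal_hdd:.0f}")
print(f"Cold forecast HDD: {cold_hdd:.0f}")
print(f"Increase: {increase_pct:.1f}%")
```

A higher HDD total only suggests more heating demand. Turning that into an actual oil-usage forecast would still take industry knowledge and real consumption data.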

Variations of alternative data 🗂

There are A LOT of alternative data sets. If there's a list out there, it's a running one because it seems like somebody is finding a new source to pull data from every day. But that's also what's so cool about alternative data sets; the use cases and variations are pretty much endless. Below are a few examples:

  • Website Usage - How often individuals visit and revisit a website, and how much time they spend on it.

  • Social Sentiment - A way of measuring emotions by mainly utilizing social media. Social sentiment does not measure how much or how many people are talking about a company; instead, it provides context by assessing the tone of the user's social activity surrounding the company. This is captured by using natural language processing and machine learning.

  • Geolocation - identifies the geographic location of people, usually through our mobile devices. 

  • Credit Card Transactions - this is self-explanatory.

  • Point of Sales Systems - refers to the location where a consumer makes a payment for products or services and the location where sales taxes may be due. It could be a physical store with POS terminals and systems processing card payments or a virtual sales point like a computer or mobile electronic device.

  • Satellite Imagery - these are images of earth collected by imaging satellites operated by governments or companies worldwide. Satellite imaging companies sell images by licensing them to governments and businesses. 

  • Weather Forecasts - this is also self-explanatory, but weather forecasting is executed by intersecting science and technology to predict the conditions of the atmosphere for a specific location and time.

  • Product Reviews - a report provided by customers on a commercial website to assist people in deciding whether to purchase the product or not. 

  • Shipping Data - Shipping data is key to gaining insight into the expected cost of shipping a product domestically and internationally. You can have a great product, but not having a grasp on what it's financially and logistically going to take to deliver it to the customer can painfully stunt a company's growth (e.g., Peloton).
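To illustrate the social sentiment idea from the list above, here is a toy sketch that scores the tone of a few posts. It uses a tiny hand-made word list instead of real natural language processing or machine learning, so treat the word lists, the posts, and the scoring rule as assumptions for illustration only.

```python
import re

# Toy word lists: assumptions for illustration, not a real sentiment lexicon
POSITIVE = {"love", "great", "excellent", "fast", "amazing"}
NEGATIVE = {"hate", "terrible", "broken", "slow", "awful"}


def sentiment_score(text):
    """Return a tone score in [-1, 1]: (positive - negative) / matched words."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


# Hypothetical posts about a company
posts = [
    "I love the new app, checkout is fast",
    "Terrible update, everything is broken and slow",
    "Just downloaded it today",
]

scores = [sentiment_score(p) for p in posts]
average = sum(scores) / len(scores)

for post, score in zip(posts, scores):
    print(f"{score:+.1f}  {post}")
print(f"Average tone: {average:+.2f}")
```

Notice that the number of posts tells you nothing about the tone. Three posts could be all positive, all negative, or split, which is why sentiment is measured separately from volume.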

How is alternative data used?

The critical thing to remember is that the data itself does not necessarily spit out an answer. Expecting it to is a common misconception among consumers of data and tech. The value of alternative data depends on the people consuming it and the systems in place. Someone with industry expertise who has both traditional and alternative data sets will be able to connect the dots and produce a strong forecast. An excellent example would be Major League Baseball staff looking at a position player's stats (traditional data) and pairing them with the player's ball flight and biomechanical data sets (alternative data). These data sets can paint a pretty good picture of what the player is doing, how they are doing it, whether they are successful, whether there is room for improvement, and, if there is, a plan to help the player improve. The value in that process comes from the person evaluating the data, who uses their industry knowledge along with automated systems to connect the dots. Traditional and alternative data sets can be used the same way when evaluating companies and industries.

Below is a really cool video by FintechOrama using Billions to highlight the importance of alternative data combined with industry knowledge to find alpha (alpha is the measure of an active return on an investment). I highly recommend checking out more of his content if you're interested in this type of stuff.

What are the benefits of alternative data? 😃

Alternative data has a ton of benefits. 

Real-Time Access

One of the most significant benefits of alternative data is that it's real-time, or as close to real-time as you can get in a data set. These data sets are constantly captured and reported, making it a very fluid process.

Time-Saving

Alternative data can save a lot of time when piecing together an analysis of a company or an industry. For example, the combination of social media posts, product reviews, and natural language processing can give you a solid social sentiment on a specific product, company, trend, or industry. 

Influence New Investment Ideas

Alternative data can also influence new investment ideas. For example, as of late, you may notice two specific categories attached to the top apps in the App Store: social networking and finance. A (good) investor looking at this data would immediately think, "Are there any products out there that combine the ability to trade stocks with a social element?" (There is; it's Public.)

Transparency

Alternative data can also provide transparency. Traditional datasets, although accurate, can be shaped to fit the gatherer's agenda. Traditional datasets on a business are usually gathered, organized, and presented by that same business, which will usually present the data to create a narrative that shows strength in its business and industry. Alternative data can cut right through that and provide transparent context around that narrative, since the information was captured by outside sources not affiliated with the company.

What are the challenges of alternative data? 😰

Ethical & Legal Issues

There are some legal and ethical concerns with companies using alternative data, and they depend on the type of data you are using. We are still in the early years, but the best way to go about this would be to work with a third party specializing in aggregating alternative data sets. Those same third parties will also have a compliance team to ensure the collected and aggregated data is within the legal and regulatory framework. You should have a compliance team as well.

Risk Models

To gain confidence in your alternative data sets, it's essential to develop and test (and retest) risk models, and to be thorough about it. Developing risk models can be time-consuming in the beginning stages, but it is imperative for a solid forecast.

Intuition

It is essential to have a strong intuition when you're analyzing alternative data sets. I do not believe that intuition is something that someone is born with. I think it's developed and strengthened through experience and rigorous work in the applied field. That work & development comes from the exercise of constantly asking who, what, when, where, how, & why to the point where you find the answers to those questions so many times your intuition will start to kick in and internally point you to the answer before you get to it. It can be challenging to find people with good intuition; it's also a subjective skill that can be tough to quantify. 

Current growth & the future 📈

You can see the current growth in alternative data by looking at spending over the last few years: as shown in the chart below, spending is estimated to go from $232M in 2016 to $1.71B in 2020. That is absolutely bananas 🍌.
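To put those two spending figures in perspective, here is a quick back-of-the-envelope calculation of the growth multiple and the implied compound annual growth rate (CAGR). It uses only the two estimates quoted above and assumes smooth compounding between them, which real spending would not follow exactly.

```python
start_spend = 232e6  # 2016 estimate from the text, in dollars
end_spend = 1.71e9   # 2020 estimate from the text, in dollars
years = 2020 - 2016

# How many times larger the 2020 figure is than the 2016 figure
growth_multiple = end_spend / start_spend

# Compound annual growth rate: the constant yearly rate that links the two
cagr = growth_multiple ** (1 / years) - 1

print(f"Growth multiple: {growth_multiple:.1f}x")
print(f"Implied CAGR: {cagr:.1%}")
```

That works out to spending growing more than sevenfold in four years, or roughly 65% per year.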

I don't see that growth slowing down anytime soon either. As you will see below, an increasing number of companies are getting involved in the space as a product and a service. Not to mention, the growing improvements in computing capabilities will continue to evolve the industry and provide a more transparent analysis of alternative data sets. Thriving industries such as alternative data will continue to attract more skilled professionals who will contribute their elite skill sets to enhance products and services.

Where can I get alternative data? 🧾

Currently, there are three main ways to obtain alternative data: web scraping, acquisition, and third-party licensing.

Web Scraping

To obtain alternative data through web scraping, you're going to need to either have a background in computer programming, have a programmer on staff, or contract one out. Programmers write programs that search the web for specific types of data and extract them in a form the user can act on. An example of this would be scraping product information from an eCommerce site into an Excel spreadsheet.
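Here is a minimal sketch of that eCommerce example using only Python's standard library. To keep it self-contained, it parses a hard-coded HTML snippet instead of fetching a live page, and the markup, class names, and products are all hypothetical. A real site would have different markup, and you would need to check its terms of service before scraping it.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical product-listing markup; a real site would differ
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget A</span><span class="price">$19.99</span></li>
  <li class="product"><span class="name">Widget B</span><span class="price">$24.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the text inside <span class="name"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self.rows = []        # one dict per product
        self._field = None    # field the current <span> holds, if any
        self._current = {}    # product being assembled

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            self.rows.append(self._current)
            self._current = {}


parser = ProductParser()
parser.feed(SAMPLE_HTML)

# Write the rows as CSV text, which Excel can open directly
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
writer.writeheader()
writer.writerows(parser.rows)
csv_text = buffer.getvalue()

print(csv_text)
```

In practice, most of the work is in handling messy, changing markup at scale, which is a big part of why many firms buy this data instead of collecting it themselves.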

Acquisition

Acquiring raw data sets can be a step in the right direction. For most people, acquiring and analyzing raw data can be a tall task, especially without a technical background. Although this is a valuable strategy, investing in a third party's service might be more beneficial.

Third-Party Licensing

Using a third party's services will most likely provide the most value, but it's also the most expensive option. If you don't have a solid technical background but have substantial experience and intuition in your field, this is the best option for you. From my experience in tech, it tends to be more expensive to be cheap. Also, as stated above, an increasing number of alternative data-focused companies are entering the space. Below are just a handful of these companies, their primary focus within the alternative data space, and a link to their website.

  • FRED - General economic research

  • InfoTrie - Low-latency alternative data API feed

  • Quandl - General data marketplace

  • Quexopa - South American alternative data provider

  • S&P Global - General Marketplace

  • Thinknum - Company-focused datasets

  • Yewno - Knowledge graph data inference 

  • Brian Company - Provides data research through Natural Language Processing and Machine Learning. 

  • Suburbia - Focus on alternative insights on point-of-sale transactions from scalable sources.

  • Accern - Focus on intelligent decision-making from web-scraped data from over 1 billion websites.

Summary

  • Alternative data refers to data used to evaluate a company that comes from outside traditional datasets.

  • Alternative data is primarily used by buy-side firms such as hedge funds and mutual funds.  

  • Alternative data is manufactured by individuals, businesses, and tech/sensors.

  • There seems to be an endless amount of alternative data variations.

  • To get the most out of your alternative data, you need industry experts and automated systems in place. 

  • The benefits of alternative data are real-time access, time savings, new investment ideas, and transparency.

  • The main challenges of alternative data are ethical concerns and developing proper risk models.

  • Spending on alternative data was estimated at ~$1.7B in 2020.

  • The three ways you can obtain alternative data are by web-scraping, acquisition, or licensing it from a third party. 

  • There is an increasing number of private companies entering the alternative data industry all around the world.