[Marxism] Harvesting data
lnp3 at panix.com
Mon Apr 22 06:27:02 MDT 2019
The Data Race; Fund managers are paying handsomely to sift through
oceans of data on everything from flight patterns to parking lots for
clues on deals and trends that move markets. Tim Shufelt looks at why
sophisticated investors are increasingly turning to Big Data to gain an edge
The Globe and Mail (Canada), April 20, 2019 Saturday, Ontario Edition
Byline: Tim Shufelt, Staff
Last summer, a corporate jet owned by Encana Corp. embarked on a strange
pattern of flights.
Over the course of a couple of months, the Calgary company's aircraft
was tracked landing close to oil fields in Oklahoma, Utah and Montana.
Encana had operations in none of those places. Houston-based Newfield
Exploration Co., however, owned assets in all three locations.
And it turns out that Newfield's jet was also on the move around the
same time. Its executives seemed to take a particular interest in flying
to Denver, which happens to be home to Encana's chief executive officer,
The mutual visits hinted that something was in the works. And in early
November, Encana struck the largest deal in its history when it agreed
to buy Newfield for US$5.5-billion. An aggressive expansion into U.S.
shale represented a dramatic change of course for the Canadian driller,
and the deal caught the market by surprise. When trading started the
next morning, Encana's stock plummeted by 17 per cent.
A select group of investors, however, may not have been quite so
shocked. They had access to the movements of Encana's jet, giving them
strong clues about what the company's executives were up to in the
months before the acquisition.
Quandl Inc., a Toronto-based startup, allows investors to peek in on the
travel habits of companies such as Encana. By stitching together data
from aircraft registries, corporate filings and flight communications,
Quandl can track the movements of thousands of corporate jets around the
world, giving investors a new advance signal of potential market-moving
Known as "alternative data," this kind of insight into publicly traded
companies is proliferating, as investors and fund managers look to data
science for an edge that will help them beat the market. Jet activity
can foreshadow corporate deals; aggregated credit-card tallies can
reveal consumer trends; satellite imaging can track oil inventories;
information "scraped" from job sites can indicate who's hiring; data
collected by auto insurers can give clues on future car-sales figures.
The vast pools of information being generated by the digital economy
hold the power to better predict what companies will do and how their
stocks will perform.
The computer sophistication and machine learning needed to make sense of
all that information, meanwhile, is quickly evolving. Until recently the
purview of quantitative hedge funds, alternative data methods are
spreading to mainstream investing. In December, Quandl was acquired by
Nasdaq Inc., which runs a data products business in addition to
operating stock exchanges. The deal amounted to an "inflection point for
the industry," wrote Richard Johnson, a vice-president in Greenwich
Associates' market structure and technology group.
Competition among data providers is heating up. In February, Bloomberg
launched a site featuring 20 alternative data sets. "The race to take
alternative data mainstream has now begun in earnest," Mr. Johnson said.
It's a race that Canadian investors have been reluctant to join.
Despite its Toronto address, Quandl has zero Canadian names on the list
of major clients subscribing to its 50-odd data products, which range
from estimates of Tesla sales to industrial-auction results. Outside of
the big pension funds, in fact, it appears few Canadian investors are
dabbling in alternative data.
There are reasons to be cautious. There is a growing discomfort over the
capacity for Big Data analytics to observe the intimate details of
people's lives. Meanwhile, the legalities over data collection and
distribution can be murky, raising concerns over who owns particular
information and who has the right to sell it. There are tough questions
regulators are just starting to grapple with, including whether
sophisticated investors gain an unfair advantage when they have access
to data that is effectively unavailable to the masses.
And yet, wary Canadian investors run the risk of being left behind if
they wait too long. Alternative data will soon be essential to
generating competitive returns, says Tammer Kamel, Quandl's CEO. "It
will become unacceptable to be basing your investment decisions on what
happened a few months ago."
For decades, standard financial data has been the lifeblood of
fundamental investing. Investors glean what they can from whatever
public companies are required to disclose through regulatory filings and
quarterly financial statements. A handful of data providers, including
Bloomberg, Refinitiv and FactSet, have come to dominate the distribution
of that information. (Refinitiv is partially owned by Thomson Reuters
Corp., which is controlled by Woodbridge Co. Ltd., owner of The Globe
and Mail.) In 2018 alone, investors spent in excess of US$30-billion
globally for access to market data and analysis, according to an
estimate by BurtonTaylor International Consulting.
Alternative data means anything considered to be outside the realm of
traditional financial information, but that can yield valuable market or
And it's by no means a newly invented category. Investors have long
hunted for tradable information outside the bounds of financial
reporting. It used to be said that the thickness of U.S. Federal Reserve
chair Alan Greenspan's briefcase could portend monetary policy
announcements. (A big haul meant he was carrying the documentation to
support a rate cut, or so the theory went.) Hedge fund managers have
also been known to directly observe retail foot traffic or cross-border
shipping or executives appearing in certain airports - anything to get a
read on what, or how well, a company is doing at that very moment.
What has changed is the sheer volume of data now being produced,
everywhere. The internet has more than 1.5 billion live sites.
Facebook users create about 3.3 million posts a minute. The Internet of
Things is connecting everything from cars to household appliances, and
smartphones are constantly tracking their users' locations. By next
year, roughly 1.7 megabytes of data will be generated each second for
every person in the world.
Most commercial information is simply "exhaust" - a byproduct of a
company's main business, Mr. Kamel says. But there is an active market
for those companies to turn their data into revenue.
Many telecommunications companies give third parties access to user
location data for a fee; financial intermediaries will compile
credit-card transaction data; and policy information from auto insurers
can reveal which models of cars are selling best. "You can find out
almost anything you want to know about a stock or a commodity or a
consumer, if you connect to the right database," Mr. Kamel says.
"Somebody's taking that measurement."
That's where alternative data providers come in, typically licensing
that information and turning it into data sets marketed to big hedge
funds and asset managers. There are currently more than 400 providers
like Quandl, up from around 100 a decade ago, according to
"Web scraping," or data extracted from websites, is the largest
subcategory. Using data scraped from Best Buy Co. Inc.'s website, for
example, New York-based startup Thinknum showed robust sales in Amazon
products, such as its Alexa-powered smart speakers, starting around
Black Friday last year - nearly one month before Amazon.com Inc.
announced record holiday sales for its devices.
Research firm Opimas estimates that hedge funds and asset managers
scraping sites for investment purposes accounted for 5 per cent of all
web traffic last year.
So-called sentiment data scraped from social media, financial news and
online forums are among the more established alt data products.
Toronto-based Buzz Indexes built a model that scours sites like
StockTwits and Twitter for insight into how investors feel about
A natural language processing algorithm looks for signs of investor
positivity toward U.S. large-cap stocks and calculates a sentiment score
for each name. The 75 stocks with the highest scores are included in the
Buzz NextGen AI US Sentiment Leaders Index, which, back-tested to the
start of 2013, has returned an average of 17 per cent annually, compared
to 11 per cent for the S&P 500 index. Not too long ago, the idea that
there might be wisdom in the online conversations of investors was met
with cynicism, says Buzz Indexes founder Jamie Wise. "Today, there's
probably not a CIO at any major asset manager on the continent that
isn't thinking about an alt data strategy."
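The selection step described above (score each stock, keep the highest-scoring names) can be sketched in a few lines. This is an illustrative sketch only, not Buzz Indexes' actual model: the function name, the per-post scores in the range -1 to 1, and the use of a simple average are all assumptions, and the tickers are made up.

```python
from statistics import mean

def sentiment_leaders(scores_by_ticker, top_n=75):
    """Return the top_n tickers ranked by average sentiment, highest first.

    scores_by_ticker maps a ticker to a list of per-post sentiment scores.
    Averaging the scores is an assumption made for illustration; the
    article does not say how the real index aggregates them.
    """
    # Tickers with no posts are skipped rather than scored as neutral.
    averaged = {t: mean(s) for t, s in scores_by_ticker.items() if s}
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(averaged, key=lambda t: (-averaged[t], t))
    return ranked[:top_n]

# Made-up tickers and scores, purely for demonstration.
sample = {
    "AAA": [0.5, 0.25],   # mean 0.375
    "BBB": [-0.5, 0.0],   # mean -0.25
    "CCC": [1.0, 0.5],    # mean 0.75
    "DDD": [],            # no posts, excluded
}
leaders = sentiment_leaders(sample, top_n=2)
print(leaders)
```

With the default top_n of 75 the same function would return the 75 highest-scoring names, matching the size of the index described in the article.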
Many have progressed well beyond the thinking stage. BlackRock uses an
active quantitative approach in its Advantage funds, which search for
investment signals from a range of data sources, including weather
patterns, travel-site bookings and employee reviews from sites like
Glassdoor.com. "Combining millions of responses can indicate a company's
state of health, as those with happy employees tend to outperform their
competitors," BlackRock said in a recent brochure for its Advantage
funds. Meanwhile, Franklin Templeton Investments recently signed a deal
with platform company Elsen, giving traditional portfolio managers
easier access to big sets of data.
For large investors and asset managers, getting access to market and
company intelligence that gets as close as possible to real-time data is
worth paying good money for. Quandl's data sets range in price from
US$25,000 to US$250,000 a year. Other products on the market, like
specialized satellite intelligence, can cost upwards of US$1-million a year.
A pair of professors from the University of California at Berkeley
recently demonstrated how satellite images' predictive power can justify
such an extravagant price tag. They looked at daily images of the
parking lots of major U.S. retailers, including Walmart and Target, over
a six-year period to identify whether counting car traffic could help
predict earnings and stock movements. A trading strategy built on buying
shares in retailers with abnormally high parking lot traffic, and
shorting those with low traffic, would have paid off handsomely once
earnings were announced, the analysis found. Compared to a buy-and-hold
approach, that satellite-informed portfolio generated average excess
returns of 4.6 per cent.
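The long-short logic described above can be sketched roughly as follows. This is not the Berkeley researchers' code: the definition of "abnormal" traffic as a percentage deviation from a baseline, the 25 per cent threshold, the equal weighting, and all retailer names and numbers are assumptions chosen for illustration.

```python
def long_short_portfolio(traffic, baseline, threshold=0.25):
    """Split retailers into long and short lists by abnormal car traffic.

    abnormal = (observed - baseline) / baseline. The threshold is a
    hypothetical cut-off; the article does not give the study's value.
    """
    longs, shorts = [], []
    for name in sorted(traffic):
        abnormal = (traffic[name] - baseline[name]) / baseline[name]
        if abnormal >= threshold:
            longs.append(name)       # unusually busy lots: buy
        elif abnormal <= -threshold:
            shorts.append(name)      # unusually empty lots: short
    return longs, shorts

def long_short_return(longs, shorts, returns):
    """Equal-weighted return of the long leg minus the short leg."""
    if not longs or not shorts:
        return 0.0
    long_leg = sum(returns[n] for n in longs) / len(longs)
    short_leg = sum(returns[n] for n in shorts) / len(shorts)
    return long_leg - short_leg

# Made-up car counts and post-earnings returns, for demonstration only.
traffic = {"RetailerA": 150, "RetailerB": 100, "RetailerC": 50}
baseline = {"RetailerA": 100, "RetailerB": 100, "RetailerC": 100}
returns = {"RetailerA": 0.0625, "RetailerB": 0.0, "RetailerC": -0.03125}

longs, shorts = long_short_portfolio(traffic, baseline)
spread = long_short_return(longs, shorts, returns)
print(longs, shorts, spread)
```

The spread is the quantity to compare against a buy-and-hold benchmark; the 4.6 per cent figure in the article is the study's result and cannot be reproduced from toy data like this.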
Silicon Valley-based Orbital Insight is one of the leaders in using
satellite technology to spot tradable economic or company data in real
time, mostly in the consumer and energy sectors. Last September, RBC
Capital Markets announced a partnership with Orbital that provides the
investment bank with geospatial data to include in its equity research.
An RBC report from January said Orbital's images of storage tanks
pointed to declining global crude oil inventories from their December
peak - one sign the market could be tightening and prices headed higher.
"Everybody's trying to get into the alternative data space," says
Fardeen Khan, head of strategic initiatives at RBC Capital Markets.
But he adds that it's not a standalone investment approach. The idea
behind RBC's arrangement with Orbital, as well as the bank's other data
science endeavours, is to complement the fundamental and technical
process. "When you look at alternative data as a standalone, the
insights are not sufficient to say you should go fully long on this
company or go short on a specific name," says Mr. Khan.
That sentiment is echoed by Ron Mock, CEO of Ontario Teachers' Pension
Plan, which uses some data-driven trading strategies and is "leveraging
the deep insights it's capable of bringing," he said during a discussion
at the World Economic Forum in Davos, Switzerland, in January. "We have
to be very, very mindful, that we can't push it so far that we turn our
Nearly a decade ago, hedge funds were the only ones most willing to take
a chance on such an exotic, untested idea as using alternative data, and
they have been the main driver behind the growth of that industry.
Global spending on alternative data sets is currently about US$3-billion
per year, according to JP Morgan, a small fraction of the size of the
conventional data business.
For the industry to assume a larger profile, it will need to extend its
appeal to more traditional asset managers. Now is a good time to do just
that, Greenwich's Mr. Johnson says. "A lot of active managers are
struggling to beat passive benchmarks, and they're looking for a new
edge," he says.
The passive investing craze has made life difficult for traditional
active managers. Franklin Templeton, for example, saw its global assets
under management decline by 14 per cent last year.
The average active fund manager, however, has very different data needs
than a giant U.S. quant fund. Without the infrastructure to analyze raw
data, most fundamental investors require data that have been ingested,
formatted and packaged, or fed into platforms they can incorporate into
their own investment processes.
Quandl's corporate-jet-tracking app is one example of this.
The idea for an aviation-based investment tool came out of a hedge-fund
trade from early 2017.
A trio of New York-based funds figured out how to track Johnson &
Johnson's Gulfstream jet on the internet and found it sitting on the
tarmac at a Swiss airport for more than a week, just a few kilometres
away from the headquarters of pharmaceutical company Actelion Ltd.
Convinced a major tie-up was being negotiated, the hedge funds loaded up
on Actelion shares, which soared when J&J announced a US$30-billion deal
to acquire the Swiss company a few days later. When Abraham Thomas,
Quandl's chief data officer, read about that payday, he thought: "What
if we could automate that process?" By combining flight location data
and ownership information from several different sources, Quandl can now
track the flight activity for a majority of the companies in the Russell
Most of those companies, however, would prefer to keep that information
to themselves. They'll often try to conceal their own aircraft through
subsidiaries and holding companies, or complex leasing arrangements. By
poring over aircraft registrations, operator licences, public filings
and corporate parent-subsidiary relationships, Quandl has built a
database of 29,000 jets and counting.
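The two steps described above, unwinding ownership chains and then matching flights to places of interest, can be sketched as follows. This is a hypothetical illustration, not Quandl's method: the function names, the 50-kilometre radius, the tail number, the company names and the coordinates are all invented, and real registry and flight data would be far messier.

```python
from math import radians, sin, cos, asin, sqrt

def ultimate_parent(entity, parent_of):
    """Follow subsidiary -> parent links to the top of the chain."""
    seen = set()
    while entity in parent_of and entity not in seen:
        seen.add(entity)              # guard against circular records
        entity = parent_of[entity]
    return entity

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km: mean Earth radius

def landings_near(landings, registry, parent_of, sites, radius_km=50.0):
    """Return (ultimate owner, site) for each landing close to a site."""
    hits = []
    for tail, lat, lon in landings:
        owner = ultimate_parent(registry.get(tail, "unknown"), parent_of)
        for site, (slat, slon) in sites.items():
            if haversine_km(lat, lon, slat, slon) <= radius_km:
                hits.append((owner, site))
    return hits

# Entirely fictional data for demonstration.
registry = {"N100XX": "Blue Leasing LLC"}            # tail -> registrant
parent_of = {"Blue Leasing LLC": "HoldCo Inc",
             "HoldCo Inc": "Acme Energy Corp"}       # child -> parent
sites = {"Target field": (36.0, -97.0)}              # name -> (lat, lon)
landings = [("N100XX", 36.1, -97.0),                 # about 11 km away
            ("N100XX", 40.0, -105.0)]                # hundreds of km away

hits = landings_near(landings, registry, parent_of, sites)
print(hits)
```

The point of the sketch is that the jet is registered to a leasing shell, yet the landing is still attributed to the company at the top of the ownership chain.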
Most subscribers are using the product as one part of an M&A investing
strategy, to help shed light on rumours or suspected deals, Mr. Thomas
says. Others use it for protecting their short positions.
Investors betting against a stock are vulnerable to that company being
acquired, since such an announcement typically results in a big jump in
share price - and big losses for short sellers. "On other occasions,
activist hedge funds want to find out if the CEO is gallivanting around
the world on the company dime," Mr. Thomas says.
Corporate jet data is one of dozens of data sets that Quandl says puts
it in the alternative-data lead, Mr. Kamel says. Already, the company
leads the industry in brand recognition, according to a recent Greenwich
Associates study. And being acquired by Nasdaq represents a huge boost
to the company's profile and credibility.
"When you hand someone a card that says Nasdaq - part of the fundamental
structure of capital markets - that helps a lot," Mr. Kamel says. "Now
we're standing on the shoulders of a giant."
So far, Canadian hedge funds have taken a pass on alternative data -
almost all of them, in fact, according to Claire Van Wyk-Allan, head of
the Canadian chapter of the Alternative Investment Management
Association. Most Canadian players are just not big enough to justify
the cost. The hedge fund industry here pales in comparison to behemoths
on the other side of the border. Bridgewater Associates, for example,
manages about US$160-billion, while only a handful of Canadian hedge
funds surpass even the US$1-billion mark.
"We are not able to spend $100,000 every month on all kinds of data. We
aren't Bridgewater," says Ernest Chan, who runs QTS Capital Management
in Niagara-on-the-Lake, Ont., and manages a small hedge fund. His own
experience with alternative data suggests it generates one to two
percentage points of "alpha," or excess returns. For Bridgewater, that
would amount to a boost to annual returns of US$1.6-billion to
US$3.2-billion. "But if you are a $100-million fund, alternative data is
not necessarily a must-have," Mr. Chan says.
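Mr. Chan's arithmetic can be checked with a short calculation. The fund sizes and the one-to-two-point alpha range come from the article; treating $100,000 a month as a hypothetical data budget for the small fund is an assumption made here for comparison, and the sketch ignores the difference between U.S. and Canadian dollars.

```python
def alpha_dollars(assets, alpha_points):
    """Annual dollar gain from alpha_points whole percentage points of alpha."""
    return assets * alpha_points // 100

BRIDGEWATER_AUM = 160_000_000_000   # about US$160-billion, per the article
SMALL_FUND_AUM = 100_000_000        # the $100-million fund Mr. Chan mentions

# Hypothetical budget: the $100,000-a-month figure Mr. Chan says a small
# fund cannot afford, annualized. Using it as a cost here is an assumption.
ANNUAL_DATA_COST = 100_000 * 12

for name, aum in (("Bridgewater", BRIDGEWATER_AUM),
                  ("Small fund", SMALL_FUND_AUM)):
    low, high = alpha_dollars(aum, 1), alpha_dollars(aum, 2)
    print(f"{name}: gain ${low:,} to ${high:,}; "
          f"net of data ${low - ANNUAL_DATA_COST:,} "
          f"to ${high - ANNUAL_DATA_COST:,}")
```

On these assumptions the data bill is a rounding error for the large fund, while for the small fund it would swallow the entire gain at the low end of the alpha range.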
The major Canadian pension funds, on the other hand, are certainly big
enough to use algorithmic trading strategies and advanced data
analytics. "If you go to a quantitative investment conference, it is
dominated by pension plans," Mr. Chan says. Canada Pension Plan
Investment Board, Ontario Teachers' Pension Plan and Alberta Investment
Management Corp. all declined to comment on how they're using
alternative data in their investment decisions.
Canada's big asset managers, meanwhile, appear to be on the sidelines
when it comes to alternative data. While Quandl boasts of having 14 of
the world's 15 largest asset managers as customers, the company has yet
to land a big Canadian name. "It's a little frustrating that it's easier
for me to sell in New York than in my own backyard," Mr. Kamel says.
Though Toronto has emerged as a global fintech hub, Bay Street asset
managers seem reluctant to evolve.
"Canadians just might be used to doing things the regular old way," Ms.
Van Wyk-Allan says.
Part of the hold-up might be in how alternative data is typically
marketed. "Looking at this data as a source of alpha is like entering
into a nuclear arms race, where you're constantly in search of the next
data set," says Ashby Monk, executive director of Stanford University's
Global Projects Center, which studies, among other things, how
technology can improve long-term investing. "If you're a patient
investor, that's probably not the best use case for alternative data."
Instead, he suggests it can be used to better understand risk in a
portfolio, to assist in due diligence and to make better capital
Alternative data is by no means risk-free, however. It's an unregulated
space that lacks legal clarity.
While insider trading generally has a fairly narrow legal definition,
some alternative data strategies are starting to look awfully similar.
The common regulatory test for insider trading asks whether a piece of
information is material and non-public. Alternative data certainly has
the power to be material to a company's stock, by providing timely
indicators of a company's health. And if distributed only to a limited
number of investors, or even exclusively to a single hedge fund, it can
be difficult to argue that data is public.
"The line between public and material non-public information is key
here," says Kirsten Thompson, a partner at Dentons and national lead of
the law firm's transformative technologies and data strategy group.
"Securities regulation requires you to know which side of that line you
are on, and alternative data makes that difficult." Some large hedge
funds are known to avoid buying "exclusive" data sets for fear of legal
There is growing alarm, meanwhile, around the sharing of personal data
by companies that compile it. Last year, it was discovered that location
information on U.S. cellphone users sold by telecom providers was ending
up in the hands of bounty hunters. Canadian telcos also sell location
data to third parties, which they say is only done with users' explicit
Canadian privacy legislation regulates the personal information that
corporations collect, limiting the ways it can be shared.
And in any case, investors are not interested in obtaining anyone's
personally identifiable information, Quandl's Mr. Thomas says.
"We tell our vendors, if you have personal records, don't even send us
the data. We don't even want it to touch our servers."
But it can be difficult to truly anonymize certain information.
"Our machine learning and our algorithms are now getting so
sophisticated that you may have data that you think is not identifiable,
but the algorithm can, in fact, identify somebody," Ms. Thompson says.
Data science can also be used to approximate protected data. Access to
Canadian credit scores is limited, for example, but a fairly accurate
estimate can be computed from other sources, including social media.
Investors considering accessing alternative data need to conduct their
due diligence, Ms. Thompson says. "There's a constellation of questions
you should be asking, including the sources of data, appropriate
consents, the genesis of the information." Data harvesting methods like
web scraping can violate the terms and conditions set out on a website's
fine print, in which case, data vendors would be prevented from selling
Quandl says it is offered about 100 datasets each month by companies and
data hunters looking to cash in on the data they have collected. In
addition to evaluating the data quality and its potential value to
investors, the company says it takes pains to trace each potential data
set back to its original source. "Our customers understand we've vetted
the data for personal information, insider information and ownership
issues," Mr. Thomas says.
Outside of Canada, adoption of alternative data is growing fast, despite
the regulatory limbo. Canadian investors won't have the luxury of taking
a cautious approach much longer, Mr. Thomas says. "As it spreads wider
and wider, if I don't have this data, I'm at a disadvantage. It becomes