
Ships, satellites, harvests, footsteps. Every alternative data provider, catalogued with coverage, history, delivery, and price. Free to read.
Every chapter opens with a primer — what the data is, what it predicts, what it costs — followed by the full provider register.









…and nine further chapters, from app usage to on-chain data. Browse the full register.
343 providers listed, 207 verified to date — including a dedicated chapter of free and open sources. Verified means the site is live, on its own brand, and editorially checked — never paid placement. Coverage grows over time; listings are informational, not investment advice.
Public and open sources, free to access if not always easy to use: satellite archives, regulatory filings, on-chain explorers, and official statistics.
Consumer spend panels, mapped to tickers and available weeks ahead of reported revenue.
Transaction data begins as a swipe. Somewhere upstream of the settlement system, that event is captured, anonymised, and aggregated with millions of others into a panel covering tens of millions of de-identified accounts. The most sophisticated panels map those accounts to individual company tickers, giving a buyer a real-time approximation of a retailer's revenue weeks before the company reports it. Buyers should weigh panel representativeness, the depth of history, and the compliance posture around personally identifiable information.
SKU-level prices, assortment, and traffic scraped daily across thousands of retailers.
Web data is collected by crawling public websites at scale: product pages, price points, stock status, and the structure of an online catalogue. Specialist firms run the infrastructure that fetches, parses, and normalises these pages into clean panels of prices and assortment. Investors use it to track pricing power, promotional intensity, and out-of-stock rates across competitors. The main considerations are coverage breadth, the legality and robustness of collection, and how cleanly the data maps to companies and brands.
Entity-level sentiment scored from news, filings, and earnings calls in real time.
News and sentiment data turns unstructured text into structured signals. Providers ingest news wires, regulatory filings, and earnings-call transcripts, then resolve the entities mentioned and score tone, novelty, and relevance in real time. Quant teams fold these scores into short-horizon models; fundamental teams use them to monitor event flow across a watchlist. Key buyer questions are entity-resolution accuracy, latency, language coverage, and the length of history available for back-testing.
Store visits and movement panels, resolved at the individual location level.
Foot-traffic data is built from mobile-device location signals, sampled from apps and ad exchanges, then cleaned into visit counts at specific points of interest. Aggregated to the store, brand, or chain level, it estimates how many people walked into a venue and how that count is trending. Investors use it to nowcast same-store sales and to gauge the pull of a new format or location. Privacy compliance, panel stability over time, and the quality of place-matching are the decisive considerations.
Emissions, controversies, and physical climate risk, measured independently of issuer disclosure.
ESG and climate data measures environmental, social, and governance characteristics independently of what an issuer chooses to disclose. Providers combine company reporting, regulatory records, news controversies, and physical-risk modelling into scores and raw indicators. Investors use it for screening, risk management, and increasingly for regulatory reporting. Buyers should scrutinise methodology transparency, how scores are revised over time, and the degree to which ratings from different vendors actually agree.
Imagery with ML detections: ships in ports, cars in parking lots, crop health from orbit.
Satellite and geospatial data starts as imagery captured from orbit, then becomes useful once machine-learning models detect the objects that matter: cars in a parking lot, ships in a port, oil in a storage tank, or the health of a field. The result is a measured time series of physical activity, often available before any official figure. Investors use it for commodity supply, retail traffic, and industrial output. Revisit frequency, cloud cover, detection accuracy, and history depth drive the value.
Vessel tracking, port congestion, bills of lading, and container-level freight flows.
Supply-chain data tracks goods as they move through the world: vessels reporting their position, containers passing through ports, and customs records that name the shipper and consignee on each bill of lading. Stitched together, these sources reveal who is buying what from whom, and where freight is congested. Investors use it to read demand for raw materials, to map a company's suppliers, and to anticipate shipping rates. Coverage by trade lane, the lag on customs filings, and entity matching are the core considerations.
Job postings, headcount signals, and hiring velocity across millions of employers.
Workforce data is assembled from job postings, professional profiles, and payroll-adjacent signals, then resolved to individual employers. It shows where a company is hiring, how fast its headcount is changing, and which skills it is competing for. Investors read it as a leading indicator of expansion, cost pressure, and strategic pivots, and to map talent flows between competitors. The questions that matter are how postings are deduplicated, how profiles are kept current, and how cleanly the data ties back to public tickers.
Crop yield nowcasts, climate risk indices, and commodity supply signals, field by field.
Weather and agriculture data fuses meteorological observations, soil and satellite measurements, and agronomic models into estimates of crop condition and likely yield. The output ranges from field-level analytics to country-scale production nowcasts. Investors use it to anticipate harvests, model commodity supply, and price weather risk well ahead of official crop reports. Buyers should consider geographic coverage, the agronomic rigour behind the models, update frequency, and how far the history extends across past growing seasons.
DAU, MAU, retention, and revenue estimates for mobile apps globally.
App-usage data estimates how many people download, open, and pay within mobile applications, modelled from device panels and store signals. It yields download counts, active-user trends, retention curves, and revenue estimates at the app and publisher level. Investors use it to track engagement for consumer-internet and gaming names well before quarterly disclosure. The decisive factors are panel size and geographic mix, the accuracy of the modelling against known ground truth, and how methodology changes affect comparability over time.
Purchase-level data extracted from anonymised email inboxes at panel scale.
Email-receipt data is parsed from the order confirmations that land in consumers' inboxes, sourced from panellists who have consented through mailbox apps and tools. Because a receipt itemises what was bought and at what price, the data resolves to specific merchants and even product lines. Investors use it to track e-commerce demand, basket size, and subscription churn. The considerations are panel consent and representativeness, parsing accuracy across merchants, and how well the panel survives app and policy changes.
Loan origination trends, delinquency signals, and private credit market flows.
Credit and lending data captures the formation and performance of loans: originations, balances, delinquencies, and the structure of securitised pools. Providers aggregate servicer feeds, public records, and structured-finance disclosures into clean panels. Investors use it to read consumer health, to underwrite structured products, and to monitor private-credit flows that never appear in headline statistics. Buyers should assess loan-segment coverage, the lag and completeness of servicer reporting, and how borrower information is anonymised.
Parsed regulatory filings, court records, lobbying disclosures, and government contract awards.
Public-records data takes documents that are technically open but practically hard to use — regulatory filings, court dockets, lobbying disclosures, and government contract awards — and turns them into structured, searchable datasets. Providers handle the collection, parsing, and entity resolution that make the records analysable at scale. Investors use them to track litigation risk, government demand, and the fine print inside filings. The key questions are completeness across jurisdictions, parsing accuracy, update latency, and how reliably records tie to public companies.
Primary research: one-on-one calls with industry practitioners and quantified survey panels.
Expert networks and survey panels supply primary research rather than passively collected exhaust. Networks broker one-on-one calls with industry practitioners, while survey platforms field structured questionnaires to defined audiences and quantify the answers. Investors use them to test a thesis, fill an information gap, and triangulate against other data. The central considerations are compliance — managing material non-public information and conflicts — alongside expert quality, panel representativeness, and the controls around how insights are sourced and recorded.
Industrial sensor feeds, energy consumption signals, and connected-device telemetry.
Sensor and IoT data is the telemetry thrown off by connected devices and meters: energy consumption, industrial equipment status, and utility usage at fine time resolution. Providers aggregate these feeds, often with customer consent, into panels that proxy real economic activity. Investors use them to track industrial output, energy demand, and utility trends ahead of official series. Buyers should weigh consent and coverage, the representativeness of the device population, and the engineering needed to turn raw telemetry into a clean signal.
Exchange flows, wallet clustering, and market microstructure for digital assets.
On-chain data reads the public ledgers of blockchains directly: transfers, exchange inflows and outflows, wallet clustering, and the behaviour of large holders. Providers label addresses, reconcile activity across chains, and add exchange microstructure on top. Investors use it to gauge accumulation and distribution, network usage, and stablecoin flows in a market that runs around the clock. The considerations are labelling accuracy, chain and exchange coverage, and how a provider handles the constant churn of new protocols.
Claims data, prescription flows, clinical trial signals, and patient journey analytics.
Healthcare data spans medical and pharmacy claims, prescription flows, electronic health records, and clinical-trial activity, all de-identified to protect patients. Providers normalise these sources into longitudinal views of how treatments are prescribed and how patients move through the system. Investors use them to track drug uptake, procedure volumes, and the progress of trials. Buyers must weigh patient-privacy compliance, the completeness of claims capture, the lag in the data, and how dependably it maps to specific therapies and manufacturers.
Pre-IPO revenue estimates, VC funding flows, and private company operational metrics.
Private-markets data shines a light on companies that do not file public accounts: venture and buyout funding rounds, investor and fund performance, and operational signals scraped or modelled for pre-IPO names. Providers assemble deal databases and layer on web and hiring exhaust to estimate momentum. Investors use them to source deals, benchmark valuations, and track companies on the path to listing. The considerations are dataset completeness, the basis behind any modelled estimates, update cadence, and the inevitable gaps where private firms simply disclose nothing.
The alternative data market has grown into a multi-billion-dollar industry, yet its map remains private property. The established guides are excellent — and expensive, and closed. An analyst at a mid-sized fund still cannot answer a simple question without a sales call: who sells what, and roughly at what price?
This atlas takes the opposite position: the register is open to anyone, free to read. Vendors are welcome to claim and correct their entry by writing to us; in time they will be able to file samples and due-diligence documents for buyers. Any future revenue will come from richer vendor profiles — never from gating the map itself.