Plain Summary
Analysis-Ready Data (ARD) is satellite imagery that has been processed to a standard where it can be used directly for analysis without additional preprocessing. This means the image has been geometrically corrected (pixels are in the right geographic locations), radiometrically calibrated (pixel values represent meaningful physical quantities like surface reflectance rather than arbitrary digital numbers), atmospherically corrected (the atmosphere's distortion has been removed), and often cloud-masked (unusable pixels are flagged). ARD is the difference between receiving raw ingredients and receiving a prepared, measured, recipe-ready mise en place.
Why It Matters
The vast majority of time spent working with satellite data is not spent doing analysis. It is spent getting the data ready for analysis.
A remote sensing analyst who wants to study vegetation health across a region over five years using Sentinel-2 imagery faces this sequence before any actual science begins: download the scenes, check cloud cover, apply atmospheric correction to convert top-of-atmosphere reflectance to surface reflectance, verify geometric alignment and reproject to a common coordinate reference system, resample to a common grid, apply cloud and shadow masks, verify radiometric consistency across scenes, and handle any data gaps. For a single scene, this takes minutes to hours depending on tooling. For a time series across a large area, it can take days to weeks.
This preprocessing burden is arguably the single largest barrier to wider use of Earth observation data. Not the cost of the data (much of it is free through programs like Copernicus). Not the complexity of the analysis (vegetation indices are straightforward arithmetic on spectral bands). The barrier is the gap between what the data provider delivers and what the analyst needs.
ARD closes this gap. When data is delivered as ARD, the analyst receives surface reflectance values in a known projection with quality flags already applied. They can start computing NDVI immediately. They can compare scenes from different dates because the radiometry is consistent. They can composite across sensors because the data has been harmonized to a common standard.
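As a minimal sketch of what "start computing NDVI immediately" can look like, the snippet below computes NDVI from red and near-infrared surface reflectance and skips pixels that a quality flag marks as unusable. It uses plain Python lists to stay dependency-free (array libraries would be the usual choice in practice). The function name `ndvi`, the sample values, and the boolean `valid` mask are illustrative assumptions, not part of any specific ARD product.

```python
from typing import Optional, Sequence


def ndvi(red: Sequence[float], nir: Sequence[float],
         valid: Sequence[bool]) -> list[Optional[float]]:
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red); None where unusable."""
    out: list[Optional[float]] = []
    for r, n, ok in zip(red, nir, valid):
        denom = n + r
        # Skip flagged pixels and avoid division by zero.
        if not ok or denom == 0:
            out.append(None)
        else:
            out.append((n - r) / denom)
    return out


# Hypothetical surface reflectance values for three pixels.
red = [0.05, 0.10, 0.30]
nir = [0.45, 0.30, 0.32]
valid = [True, True, False]  # third pixel flagged, e.g. as cloud

print(ndvi(red, nir, valid))
```

Because the inputs are already surface reflectance with quality flags, the whole analysis step reduces to this arithmetic.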
The Committee on Earth Observation Satellites (CEOS) formalized ARD requirements in its CEOS Analysis Ready Data for Land (CARD4L) framework, defining minimum specifications for what constitutes analysis-ready optical and radar data. These specifications cover geometric accuracy, radiometric consistency, atmospheric correction quality, and metadata completeness.
Processing Levels
Satellite data is conventionally described in processing levels that indicate how much correction has been applied:
Level 0 — Raw instrument data. Unprocessed digital numbers as recorded by the sensor. Useful only for specialized calibration work.
Level 1A — Reconstructed, unprocessed instrument data at full resolution with radiometric and geometric coefficients appended but not applied. The data is organized but uncorrected.
Level 1B / Level 1C — Radiometric calibration applied. For Sentinel-2, Level-1C is top-of-atmosphere (TOA) reflectance: the sensor's measurement includes both the surface signal and the atmosphere's contribution. Pixels are orthorectified (geometrically corrected using a digital elevation model).
Level 2A — Surface reflectance. Atmospheric correction has been applied to remove the atmosphere's contribution, yielding an estimate of what the surface actually reflects. For Sentinel-2, this is produced by the Sen2Cor processor. Cloud and shadow masks are included. This is the starting point for most ARD products.
Level 3 — Temporally composited or spatially mosaicked products. Monthly composites, seasonal averages, gap-filled time series. These aggregate multiple Level-2 scenes into summary products.
Level 4 — Derived geophysical variables. Not reflectance but interpreted products: vegetation indices, land cover maps, burned area, water extent. These are the outputs of analysis, not its input.
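The reflectance levels in the list above are usually stored as scaled integers rather than floating-point values. The sketch below shows the conversion for Sentinel-2 style products, assuming a quantification value of 10000 and an additive offset of -1000 (as used by newer processing baselines; older products use an offset of 0). These defaults are assumptions for illustration; the authoritative values should be read from each product's metadata.

```python
from typing import Optional

QUANTIFICATION_VALUE = 10000  # assumed default; read from product metadata
ADD_OFFSET = -1000            # assumed (newer baselines); 0 for older products
NODATA = 0                    # digital number reserved for "no data"


def dn_to_reflectance(dn: int, quantification: int = QUANTIFICATION_VALUE,
                      offset: int = ADD_OFFSET) -> Optional[float]:
    """Convert a stored digital number to unitless reflectance."""
    if dn == NODATA:
        return None
    return (dn + offset) / quantification


print(dn_to_reflectance(1500))            # 0.05
print(dn_to_reflectance(1500, offset=0))  # 0.15
```

The two calls show why the offset matters: the same stored value maps to different reflectances depending on which convention the product follows.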
ARD typically corresponds to Level 2A: surface reflectance with quality flags, in a known projection, ready for direct use. Some definitions extend ARD to include temporal compositing (Level 3), but the core requirement is that the data represents a physically meaningful quantity at the Earth's surface with documented uncertainty.
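Where a definition extends ARD to temporal compositing, the step from Level 2A to Level 3 can be sketched as a per-pixel median over the clear observations in a stack of scenes. This is a simplified illustration in plain Python; the `median_composite` name and the convention of marking masked pixels with `None` are assumptions, and operational compositing typically uses more elaborate selection rules.

```python
from statistics import median
from typing import Optional, Sequence


def median_composite(
    scenes: Sequence[Sequence[Optional[float]]],
) -> list[Optional[float]]:
    """Per-pixel median across dates, ignoring masked (None) observations."""
    composite: list[Optional[float]] = []
    for pixel_series in zip(*scenes):
        clear = [v for v in pixel_series if v is not None]
        # A pixel with no clear observation stays a gap in the composite.
        composite.append(median(clear) if clear else None)
    return composite


# Three hypothetical dates, four pixels each; None marks a masked pixel.
scenes = [
    [0.10, None, 0.30, None],
    [0.12, 0.20, None, None],
    [0.50, 0.22, 0.28, None],
]
print(median_composite(scenes))
```

The median is a common choice here because a single unmasked cloud (the 0.50 in the first pixel) does not drag the composite value the way a mean would.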
What Makes Data "Analysis Ready"
CEOS CARD4L defines specific requirements across several dimensions:
Geometric Accuracy
Pixels must be located correctly on the Earth's surface. This requires orthorectification using a digital elevation model and ground control points. Sub-pixel accuracy (better than one pixel's width) is the standard for most ARD products. Without geometric accuracy, time series analysis is meaningless; you would be comparing different patches of ground across dates.
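One way to check a sub-pixel claim is to measure offsets between image locations and reference locations at a set of check points, then express the radial root-mean-square error in pixel units. The sketch below assumes the offsets have already been measured; the `radial_rmse` helper, the sample offsets, and the 10 m pixel size are illustrative.

```python
import math
from typing import Sequence, Tuple


def radial_rmse(residuals: Sequence[Tuple[float, float]]) -> float:
    """Radial RMSE of (dx, dy) offsets measured at check points."""
    if not residuals:
        raise ValueError("need at least one check point")
    mean_sq = sum(dx * dx + dy * dy for dx, dy in residuals) / len(residuals)
    return math.sqrt(mean_sq)


# Hypothetical offsets in metres between image and reference locations.
offsets_m = [(3.0, 4.0), (-3.0, 4.0), (3.0, -4.0), (-3.0, -4.0)]
pixel_size_m = 10.0

rmse_px = radial_rmse(offsets_m) / pixel_size_m
print(rmse_px, rmse_px < 1.0)  # 0.5 True
```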
Radiometric Consistency
Pixel values must represent a consistent physical quantity. For optical data, this means surface reflectance: the fraction of incoming sunlight reflected by the surface, with the atmosphere removed. For SAR, this means calibrated backscatter coefficients (sigma nought, gamma nought) that account for incidence angle and terrain effects.
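For the SAR case, two routine conversions are worth seeing side by side: linear backscatter to decibels, and sigma nought to gamma nought using the incidence angle. The sketch below uses the ellipsoid-based relation gamma0 = sigma0 / cos(theta); full radiometric terrain correction is more involved and is not shown. Function names and sample values are illustrative assumptions.

```python
import math


def to_db(linear: float) -> float:
    """Convert linear backscatter power to decibels."""
    if linear <= 0:
        raise ValueError("backscatter must be positive to express in dB")
    return 10.0 * math.log10(linear)


def sigma0_to_gamma0(sigma0: float, incidence_deg: float) -> float:
    """gamma0 = sigma0 / cos(theta), both in linear power units."""
    return sigma0 / math.cos(math.radians(incidence_deg))


sigma0 = 0.05  # hypothetical linear sigma nought
gamma0 = sigma0_to_gamma0(sigma0, incidence_deg=60.0)
print(round(to_db(sigma0), 2), round(to_db(gamma0), 2))
```

The normalization must be applied in linear units; applying it to values already in decibels is a common source of inconsistent SAR time series.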
Radiometric consistency across time is essential for change detection. If two scenes of the same unchanged area produce different pixel values because of different atmospheric conditions or sensor calibration drift, any change detection algorithm will produce false positives.
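The false-positive effect can be illustrated with a toy change detector. In the sketch below the ground does not change between two acquisitions, but the second scene carries a hypothetical 20% radiometric bias; a simple difference threshold then flags the brighter pixels as changed. The threshold, the bias, and the `changed_pixels` helper are all assumptions for illustration.

```python
from typing import Sequence


def changed_pixels(before: Sequence[float], after: Sequence[float],
                   threshold: float = 0.05) -> list[bool]:
    """Flag pixels whose reflectance differs by more than `threshold`."""
    return [abs(a - b) > threshold for b, a in zip(before, after)]


surface = [0.10, 0.20, 0.30, 0.40]     # unchanged ground reflectance
scene_a = list(surface)                # well-calibrated acquisition
scene_b = [v * 1.2 for v in surface]   # same ground, 20% radiometric bias

flags = changed_pixels(scene_a, scene_b)
print(flags)  # [False, False, True, True]
```

Nothing on the ground changed, yet half the pixels are reported as change, and the error grows with surface brightness.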
Atmospheric Correction
The atmosphere absorbs and scatters sunlight between the sun, the surface, and the sensor. Aerosols, water vapor, and atmospheric gases all contribute. Atmospheric correction models (such as Sen2Cor for Sentinel-2, LaSRC for Landsat, or 6S radiative transfer code) estimate and remove this atmospheric contribution to recover the surface signal.
This is not a trivial step. Atmospheric correction can change pixel values by 10-30% depending on conditions, and the accuracy of the correction depends on the quality of the atmospheric model and auxiliary data (aerosol optical depth, water vapor content, ozone concentration). Poor atmospheric correction propagates errors into every downstream analysis.
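To make the dependence on atmospheric terms concrete, the sketch below uses the simplified Lambertian relation common in radiative transfer formulations: TOA reflectance equals path reflectance plus the transmitted surface signal, with a coupling term for the atmosphere's spherical albedo. It is not how Sen2Cor or LaSRC are implemented, and all coefficient values are hypothetical; operational processors derive these terms from radiative transfer models and auxiliary data.

```python
def surface_to_toa(rho_surface: float, rho_path: float,
                   transmittance: float, spherical_albedo: float) -> float:
    """Forward model: TOA reflectance for a given surface reflectance."""
    coupling = 1.0 - spherical_albedo * rho_surface
    return rho_path + transmittance * rho_surface / coupling


def toa_to_surface(rho_toa: float, rho_path: float,
                   transmittance: float, spherical_albedo: float) -> float:
    """Inverse model: recover surface reflectance from a TOA measurement."""
    y = (rho_toa - rho_path) / transmittance
    return y / (1.0 + spherical_albedo * y)


# Hypothetical atmospheric terms for one band under two conditions.
clear = dict(rho_path=0.02, transmittance=0.90, spherical_albedo=0.05)
hazy = dict(rho_path=0.06, transmittance=0.75, spherical_albedo=0.12)

surface = 0.20
for name, atm in (("clear", clear), ("hazy", hazy)):
    toa = surface_to_toa(surface, **atm)
    recovered = toa_to_surface(toa, **atm)
    print(name, round(toa, 4), round(recovered, 4))
```

The same surface produces different TOA values under the two atmospheres, and the inversion only recovers it when the assumed atmospheric terms match the real ones. Inverting the hazy measurement with the clear coefficients would return a biased surface reflectance.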
Cloud and Quality Masking
Clouds, cloud shadows, cirrus, snow, and other obstructions must be identified and flagged. These pixels are not usable for surface analysis and must be excluded. Cloud masking algorithms (such as Fmask, s2cloudless, or the Sentinel-2 Scene Classification Layer) are imperfect: they miss thin cirrus, misclassify bright surfaces, and struggle with cloud edges.
ARD includes quality assessment bands or layers that encode per-pixel confidence in the classification. This allows analysts to choose their quality threshold rather than relying on a binary mask.
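The idea of choosing a quality threshold can be sketched with the class codes of the Sentinel-2 Scene Classification Layer. The snippet below defines a strict and a lenient rejection set and applies each to the same pixels. The two sets and the `usable` helper are illustrative choices, not recommendations from any specification.

```python
from typing import FrozenSet, Sequence

# Sentinel-2 Scene Classification Layer (SCL) class codes used here.
SCL_NO_DATA = 0
SCL_SATURATED_OR_DEFECTIVE = 1
SCL_CLOUD_SHADOW = 3
SCL_CLOUD_MEDIUM_PROB = 8
SCL_CLOUD_HIGH_PROB = 9
SCL_THIN_CIRRUS = 10
SCL_SNOW_ICE = 11

# Two illustrative rejection policies an analyst might choose between.
STRICT = frozenset({SCL_NO_DATA, SCL_SATURATED_OR_DEFECTIVE, SCL_CLOUD_SHADOW,
                    SCL_CLOUD_MEDIUM_PROB, SCL_CLOUD_HIGH_PROB,
                    SCL_THIN_CIRRUS, SCL_SNOW_ICE})
LENIENT = frozenset({SCL_NO_DATA, SCL_SATURATED_OR_DEFECTIVE,
                     SCL_CLOUD_SHADOW, SCL_CLOUD_HIGH_PROB})


def usable(scl: Sequence[int], rejected: FrozenSet[int]) -> list[bool]:
    """True where the pixel's class is not in the rejected set."""
    return [code not in rejected for code in scl]


scl = [4, 8, 9, 10, 5, 3]  # vegetation, medium cloud, high cloud, cirrus, bare, shadow
print(usable(scl, STRICT))   # [True, False, False, False, True, False]
print(usable(scl, LENIENT))  # [True, True, False, True, True, False]
```

The lenient policy keeps more observations at the cost of more contamination, which is the trade-off a binary mask would hide.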
Metadata Completeness
Every ARD product must include sufficient metadata to understand what it represents and how it was produced. This includes: sensor identification, acquisition date and time, processing chain description, quality assessment information, coordinate reference system, and per-pixel uncertainty where available.
Without metadata, data is orphaned. It cannot be trusted, combined with other data, or traced back to its source when problems arise. See Data Lineage.
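A completeness check along these lines can be sketched as a small validation function. The field names below are hypothetical and loosely mirror the items listed above; they are not the field names defined by CARD4L or any catalog standard.

```python
from typing import Any, Mapping

# Hypothetical field names mirroring the required items listed above.
REQUIRED_FIELDS = (
    "sensor",
    "acquired_at",
    "processing_chain",
    "quality_info",
    "crs",
)


def missing_metadata(record: Mapping[str, Any]) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Illustrative record with the quality information left out.
record = {
    "sensor": "Sentinel-2A MSI",
    "acquired_at": "2025-06-01T10:30:00Z",
    "processing_chain": ["Sen2Cor"],
    "crs": "EPSG:32633",
}
print(missing_metadata(record))  # ['quality_info']
```

Per-pixel uncertainty is left out of the required set here because the text treats it as "where available".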
The Harmonization Challenge
ARD solves the preprocessing problem for individual sensors. The harder problem is harmonization across sensors.
Sentinel-2 and Landsat 8/9 both produce multispectral imagery of the Earth's surface, but their bands are not identical. Sentinel-2's Band 4 (Red, 665 nm center) is not the same as Landsat 8's Band 4 (Red, 655 nm center). Their spatial resolutions differ (10 m for Sentinel-2's finest bands vs 30 m for Landsat), their spectral response functions differ, and their radiometric characteristics differ.
Computing NDVI from Sentinel-2 and NDVI from Landsat for the same location will produce different values, not because the vegetation changed, but because the sensors measure slightly different things. For time series that span both sensors, or for analyses that require combining them for temporal density, this inconsistency is a fundamental problem.
Harmonization goes beyond ARD. It is the process of making data from different sensors comparable: adjusting for spectral response differences, resampling to common grids, cross-calibrating radiometry, and documenting every adjustment. This is the core problem Fabric was built to solve.
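Two of the adjustments named above, spectral adjustment and resampling to a common grid, can be sketched in a few lines. The example applies a linear band adjustment to a 10 m red band and then averages 3x3 blocks to reach a 30 m grid. The slope and intercept are placeholder values chosen for illustration, not published cross-calibration coefficients, and this is not a description of Fabric's implementation.

```python
from typing import List

Grid = List[List[float]]


def adjust_band(grid: Grid, slope: float, intercept: float) -> Grid:
    """Linear spectral adjustment: adjusted = slope * value + intercept."""
    return [[slope * v + intercept for v in row] for row in grid]


def block_mean(grid: Grid, factor: int) -> Grid:
    """Downsample by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid size must be a multiple of the factor")
    out: Grid = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Hypothetical 10 m red reflectance: a 3 x 6 patch covering two 30 m pixels.
red_10m = [[0.10, 0.10, 0.10, 0.20, 0.20, 0.20] for _ in range(3)]

SLOPE, INTERCEPT = 0.98, 0.004  # placeholder coefficients, illustration only
red_30m = block_mean(adjust_band(red_10m, SLOPE, INTERCEPT), factor=3)
print([[round(v, 4) for v in row] for row in red_30m])
```

Recording the coefficients and the resampling method alongside the output is what makes the adjustment auditable later.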
The Fabric Connection
Fabric's primary function is producing ARD, and then going beyond it into cross-sensor harmonization.
When Fabric processes a fire detection pattern, it takes raw Sentinel-2 scenes, applies atmospheric correction to derive surface reflectance, computes fire-sensitive spectral indices (Sentinel-2 has no thermal infrared bands, so these rely on its shortwave-infrared bands), applies cloud masking, reprojects to a common grid, and produces a harmonized, analysis-ready product with full provenance tracking. The output is not just ARD in the CEOS sense; it is harmonized ARD that can be directly compared with Landsat-derived products, SAR observations, and historical baselines.
Fabric's processing time for this workflow, approximately 16 seconds for what would take 3-6 hours manually in a GIS, demonstrates the practical value of automating the ARD pipeline. The time saved is not a convenience. It is the difference between analysis that happens and analysis that was too expensive to attempt.
Every step in Fabric's ARD pipeline carries provenance: which atmospheric correction model was used, which cloud mask algorithm was applied, which geometric correction parameters were chosen, and what the resulting quality metrics are. This provenance is not metadata appended after the fact; it is generated as part of the processing chain, implementing Observational Grammar's requirement that every claim traces back to its physical basis.
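The difference between provenance generated in the chain and metadata appended afterwards can be illustrated with a small pattern in which every processing step returns both its result and a record of what was done. This is a hypothetical sketch, not Fabric's API: the `Product` class, the `run_step` helper, and the two toy steps are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Product:
    """A data payload plus the ordered record of how it was produced."""
    data: Any
    provenance: list[dict] = field(default_factory=list)


def run_step(product: Product, name: str,
             func: Callable[..., Any], **params: Any) -> Product:
    """Apply one step and extend the provenance in the same operation."""
    result = func(product.data, **params)
    record = {"step": name, "params": params}
    return Product(result, product.provenance + [record])


# Toy processing steps standing in for real correction and masking.
def scale(values, factor):
    return [v / factor for v in values]


def drop_flagged(values, flags):
    return [v for v, bad in zip(values, flags) if not bad]


raw = Product([1000, 2000, 3000])
step1 = run_step(raw, "scale_to_reflectance", scale, factor=10000)
final = run_step(step1, "cloud_mask", drop_flagged, flags=[False, True, False])

print(final.data)
for entry in final.provenance:
    print(entry)
```

Because a step cannot produce output without also producing its record, the provenance cannot drift out of sync with the data it describes.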
Philosophical Thread
The epistemology of preprocessing. ARD is not just a technical convenience; it is an epistemological stance. The decision to deliver surface reflectance rather than top-of-atmosphere radiance is a claim about what the analyst needs: not what the sensor measured, but what the surface reflected. The atmospheric correction is an interpretation, not a fact, and it depends on models with their own uncertainties.
This connects to Observational Grammar's core principle that every observation is a claim made by a specific sensor under specific conditions. ARD makes those conditions explicit through provenance and quality flags. A system that delivers ARD without quality information is claiming more certainty than the physics justifies.
See also: Epistemic Architecture · The Observer Problem
Related Entries
Data & Architecture: Harmonization · Cloud-Optimized GeoTIFF · STAC · Coordinate Reference Systems · Raster Data Models · Metadata & Discovery · Change Detection
Sensors & Physics: Radiometric Principles · Atmospheric Correction · Spatial/Spectral/Temporal Resolution · Signal-to-Noise & Uncertainty
Satellites & Platforms: Sentinel-2 · Landsat Program · Open Data Programs
Philosophy: Observational Grammar · Epistemic Architecture · The Observer Problem
Security & Provenance: Data Lineage · Confidence Scoring · Provenance Standards
References
[1] CEOS (2023). "CARD4L: CEOS Analysis Ready Data for Land." Committee on Earth Observation Satellites. ceos.org/ard
[2] Dwyer, J.L., et al. (2018). "Analysis Ready Data: Enabling Analysis of the Landsat Archive." Remote Sensing, 10(9), 1363. doi:10.3390/rs10091363
[3] Main-Knorn, M., et al. (2017). "Sen2Cor for Sentinel-2." Proc. SPIE 10427, Image and Signal Processing for Remote Sensing XXIII. doi:10.1117/12.2278218
[4] Vermote, E., et al. (2016). "Preliminary Analysis of the Performance of the Landsat 8/OLI Land Surface Reflectance Product." Remote Sensing of Environment, 185, 46–56.
[5] Frantz, D. (2019). "FORCE — Landsat + Sentinel-2 Analysis Ready Data and Beyond." Remote Sensing, 11(9), 1124. doi:10.3390/rs11091124
Further Reading
CEOS CARD4L Specification (CEOS, 2023). The definitive reference for what constitutes analysis-ready data. Read the optical surface reflectance and normalized radar backscatter specifications.
FORCE Documentation (David Frantz). An open-source framework for Landsat + Sentinel-2 ARD generation. Useful for understanding the practical details of atmospheric correction, cloud masking, and temporal compositing.
The Landsat Data Continuity Mission (USGS). How the Landsat program maintains radiometric consistency across 50+ years of missions, a masterclass in the long-term challenges of ARD.
Entry DAT-004 · Created February 2026 · Contributors: M33 Team · License: CC BY-SA 4.0