How DaaP, Data Products, and SaaS Thinking Come Together (and Where DataCore Fits)
Why this conversation matters now
Over the last decade, most organizations have quietly become data factories.
They log every click, payment, shipment, interaction, and error. They accumulate:
- years of transaction histories
- customer journeys across channels
- operational telemetry from applications and devices
- market, macro, and regulatory datasets from outside
If “data is the new oil,” then many companies now sit on top of massive reserves.
And yet:
- Leadership still argues in meetings with PowerPoint, not data.
- Reports are brittle: one schema change and everything breaks.
- AI pilots look great in POCs but never become operational products.
- Almost nobody can say, “These are our core data products, here is their roadmap, here is their P&L impact.”
Put simply:
They’re rich in data, but poor in data products.
At the same time, surveys and practitioner reports show a consistent pattern¹:
- Top 3 uses of data in practice
- Informing strategic decision-making
- Improving operational efficiency
- Enhancing customer service
- Bottom 3 in actual adoption (but top in hype)
- Machine learning (ML)
- Artificial intelligence (AI)
- Direct data monetization
And when organizations try to implement a Data as a Product (DaaP) strategy, the biggest reported challenges are:
- Integrating data products into existing workflows and systems
- Ensuring data quality and reliability
- Aligning data products with business value and goals
So the central question becomes:
How do we move from “data exhaust” to data products, and from accidental, ad-hoc use of data to a deliberate Data-as-a-Product operating model?
That’s what this piece is about.
We’ll walk through:
- Clear definitions
- Data products
- Data as a Product (DaaP)
- Data as a Service (DaaS)
- How all of this compares to classic SaaS
- Foundations
- Data mesh and product thinking
- The core properties of a good data product
- Operating model
- Ownership, lifecycle, SLAs, and self-service
- Why data quality & observability are product problems
- Monetization & ecosystem
- Turning DaaP into DaaS and revenue streams
- How DataCore fits
- Concrete ways DataCore can help organizations in Vietnam treat data like a product: combining rich datasets with HPC and AI infrastructure.
1. Key definitions: data product, DaaP, DaaS, SaaS
Let’s start with a clean vocabulary. A lot of confusion comes from people using the same words for different ideas.
1.1 What is a data product?
A data product is a delivered unit of value built on data that solves a specific business problem for a specific set of users.
It can be:
- a curated table (e.g., a Customer 360 table)
- a dashboard (e.g., a risk or marketing performance dashboard)
- a machine learning model (e.g., churn scores, credit scores)
- an API that exposes scores, recommendations, or micro-forecasts
dbt Labs describes a data product as a “data container or unit of data that solves a business problem and includes the metadata, pipelines, contracts, and documentation needed to produce and use it.”
In other words, a data product is more than:
“Here’s a table. Good luck.”
It has product-like properties:
- Discoverable
- Others can find it (via catalog, registry, naming conventions).
- Addressable
- It has a stable identifier or endpoint: a schema, URL, or API.
- Trustworthy & observable
- Consumers can see where it comes from, how fresh it is, and whether it’s healthy.
- Self-describing
- It carries documentation: what fields mean, how metrics are computed, who owns it, which versions exist.
- Interoperable
- It uses shared standards (schemas, formats, keys) so it can be joined, reused, and embedded.
- Secure & governed
- Access rules, masking of sensitive columns, and audit trails are part of the product spec.
A random query someone wrote once is not a data product.
A curated, documented, versioned, governed Customer Profitability Mart v2 is.
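To make these properties concrete, here is a minimal sketch (in Python; all names and fields are illustrative, not a real catalog schema) of how a data product’s metadata might be captured in a machine-readable spec:

```python
from dataclasses import dataclass, field

@dataclass
class DataProductSpec:
    """Hypothetical machine-readable spec covering the properties above."""
    name: str                 # discoverable: name as listed in the catalog
    address: str              # addressable: stable identifier or endpoint
    owner: str                # accountable team or person
    version: str              # lifecycle: semantic version, e.g. "2.0.0"
    description: str          # self-describing: what it is and how to use it
    freshness_slo_hours: int  # trustworthy: promised maximum data age
    sensitive_fields: list = field(default_factory=list)  # governed: fields to mask

spec = DataProductSpec(
    name="Customer Profitability Mart",
    address="marts.customer_profitability_v2",
    owner="retail-banking-data-team",
    version="2.0.0",
    description="Monthly profitability per customer, joined to Customer 360.",
    freshness_slo_hours=24,
    sensitive_fields=["national_id", "phone_number"],
)
```

The point is not this particular structure, but that every property above becomes an explicit, checkable field rather than tribal knowledge.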
1.2 What is Data as a Product (DaaP)?
“Data as a Product” is not just another buzzword for “data product.”
It’s an operating philosophy:
DaaP = treating datasets themselves as enduring products, with clear users, owners, roadmaps, quality standards, contracts, and lifecycle management.
So if a data product is the thing,
Data as a Product (DaaP) is the way you design, build, and run those things.
Dataversity describes DaaP as a methodology that views data as a stand-alone product, focusing on its value, quality, and ability to meet stakeholder needs, originally emerging from the data mesh movement.
Common ingredients of DaaP:
- Product-like management
- There’s a backlog, roadmap, and prioritization process for key datasets (dbt Labs).
- Interfaces & contracts
- Schemas and APIs are treated as contracts; breaking changes are versioned and communicated (dbt Labs).
- Versioning & lifecycle
- Datasets have v1, v2, deprecated versions, not just “whatever’s in the warehouse today.”
- Access rules as first-class
- Access levels and privacy constraints are part of the product spec, not an afterthought.
- Data consumers = customers
- Analysts, data scientists, applications, even external clients are understood as customers whose needs shape the product.
Netflix’s data engineering team frames this as: treat each important dataset, metric, and model as a product with clear purpose, audience, ownership, lifecycle, and quality expectations, not as an incidental by-product of systems.
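Treating a schema as a contract can be enforced mechanically. The sketch below (Python, with hypothetical field names) checks a proposed schema against the published contract and flags the changes that would break consumers and therefore require a new major version:

```python
# Published contract for v1 of a hypothetical data product: field name -> type.
contract_v1 = {"customer_id": "string", "segment": "string", "ltv": "decimal"}

def breaking_changes(contract: dict, proposed: dict) -> list:
    """Return changes that break consumers: removed fields or changed types.
    Added fields are backward-compatible and are not reported."""
    issues = []
    for name, dtype in contract.items():
        if name not in proposed:
            issues.append(f"removed field: {name}")
        elif proposed[name] != dtype:
            issues.append(f"type change on {name}: {dtype} -> {proposed[name]}")
    return issues

# Renaming 'ltv' to 'lifetime_value' looks harmless but breaks every consumer:
proposed = {"customer_id": "string", "segment": "string", "lifetime_value": "decimal"}
print(breaking_changes(contract_v1, proposed))  # -> ['removed field: ltv']
```

A check like this can run in CI before any schema change ships, turning “breaking changes are versioned and communicated” from a policy into an automated gate.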
1.3 What is Data as a Service (DaaS)?
This is related but different.
Data as a Service (DaaS) is about how data is delivered and commercialized:
Providing access to data on demand over the network (usually via APIs, feeds, or bulk exports), often as a paid or managed service.
Examples:
- Market data providers serving real-time price feeds
- Credit bureaus exposing credit scores and reports via APIs
- Location or weather providers offering usage-metered data APIs
- B2B platforms selling curated industry benchmark datasets
Acceldata summarizes it neatly: DaaS is typically a bespoke method of selling external data, whereas DaaP is about viewing your internal data ecosystem as a product to be designed and managed.
In practice, robust DaaS offerings sit on top of DaaP. You need productized, high-quality datasets before you can safely and credibly sell or expose them.
1.4 How does this compare to Software as a Service (SaaS)?
Everyone understands SaaS: software delivered over the internet, managed by the provider, typically paid on subscription.
Think of DaaP and data products as taking SaaS discipline and applying it to data.
Here’s a simple comparison:
| Aspect | SaaS | Data as a Product (DaaP) / Data Products |
|---|---|---|
| Core unit | Application / feature | Dataset / data product |
| Main value | Functionality, workflows | Insight, decision support, automation |
| Primary users | End-users (operators, customers) | Data consumers (analysts, apps, ML models, partners) |
| Interface | UI + API | Schemas, APIs, catalogs, dashboards |
| Lifecycle | Releases, patches, feature roadmaps | Versions, schema changes, SLOs, data contracts |
| Business model | Subscription per seat/org | Internal chargeback, DaaS fees, embedded value in other products |
| Quality focus | Uptime, bugs, UX | Freshness, completeness, accuracy, explainability, lineage |
In fact, dbt Labs explicitly describes DaaP as “treating data more like software: a defined, versioned code block with clearly defined ownership, purpose, and documentation.”
So you can think of it like this:
- SaaS = productized software
- DaaP = productized datasets
- DaaS = delivery & commercial model for those datasets
2. Foundations: data mesh, product thinking, and domain ownership
To understand where DaaP came from, it helps to look at data mesh.
2.1 Data mesh in one paragraph
Data mesh, formalized by Zhamak Dehghani and later detailed in her O’Reilly book, is a decentralized data architecture approach that tries to fix the scaling problems of centralized lakes & warehouses.
It’s founded on four principles:
- Domain-driven data ownership
- Data as a product
- Self-serve data platform
- Federated computational governance
Instead of a single central team owning “all data,” domain teams (e.g., Lending, Retail, Logistics) own the data they generate and publish data products to the rest of the organization.
DaaP is essentially principle #2 in action.
2.2 Product thinking applied to data
Product thinking says:
- Start from user and problem, not from technology.
- Design for usability and differentiation, not just completeness.
- Manage lifecycle deliberately: launch, iterate, retire.
- Measure success and adjust.
When you apply this to data:
- You stop building “just another pipeline.”
- You start asking:
- Who will use this dataset?
- What decision or process does it support?
- What does success look like for them?
- How will we know if this dataset is still valuable a year from now?
Netflix’s Data Health initiative is a nice example: they don’t just track impact vaguely; they actively manage the health, complexity, and standards of their data products to reduce data debt and keep data usable for future AI and analytics.
2.3 Domain teams as data publishers
In a DaaP mindset:
- Domains (e.g., Retail Banking, SME Lending, E-commerce, Supply Chain) are responsible for publishing and maintaining the data products that describe their world.
- Central platform teams provide tooling, standards, and infrastructure – not every data product themselves.
This distributes ownership while maintaining coherence through shared contracts and governance.
3. The anatomy of a good data product
Let’s zoom in. What distinguishes a “real” data product from a random dataset?
3.1 Discoverable
- The product is listed in a catalog or registry (Collibra, data catalog, dbt docs, internal portal, etc.).
- It has a clear name and description.
- You can search by business terms (“loan default rate,” “active merchants in HCMC”) and find it.
Why it matters: if people can’t find it, they reinvent it badly or assume “it doesn’t exist.”
3.2 Addressable
- It has a stable technical identifier (schema.table, S3 path, API endpoint).
- There is a documented way to access it (SQL, REST, GraphQL, etc.).
Why it matters: copy-pasting “that query someone shared in Slack” is fragile and unscalable.
3.3 Trustworthy & observable
Consumers can:
- Inspect lineage – where did this data come from? through what transformations?
- Check freshness – how up to date is it?
- Know SLOs – what uptime/freshness is promised?
- See quality signals – tests on volume, null rates, distributions, referential integrity, etc.
Monte Carlo Data and others frame this as data observability: bringing modern monitoring practices to data pipelines so you can catch issues early and maintain trust.
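As an illustration (not any particular vendor’s API), the basic freshness and completeness signals above can be computed directly from the data itself:

```python
from datetime import datetime, timezone

def freshness_hours(last_loaded_at: datetime) -> float:
    """Hours since the product was last loaded -- a basic freshness signal."""
    return (datetime.now(timezone.utc) - last_loaded_at).total_seconds() / 3600

def null_rate(rows: list, field: str) -> float:
    """Share of rows where a field is missing -- a basic completeness signal."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r.get(field) is None) / len(rows)

rows = [
    {"customer_id": "C1", "segment": "SME"},
    {"customer_id": "C2", "segment": None},
    {"customer_id": "C3", "segment": "Retail"},
]
print(round(null_rate(rows, "segment"), 2))  # -> 0.33
```

Real observability tools add lineage, anomaly detection, and alerting on top, but they bottom out in simple, testable signals like these.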
3.4 Self-describing
A good data product answers, within itself:
- What does each field mean?
- What business concepts does it represent? (e.g., what is an “active user”?)
- How are key metrics computed?
- What are known limitations or caveats?
This is usually done through:
- Embedded documentation (dbt docs, catalogs)
- Data dictionaries
- Example queries and usage patterns
Without this, data products become tribal knowledge and don’t scale beyond a few experts.
3.5 Interoperable
Interoperability can be achieved by:
- Using common IDs and keys (customer_id, merchant_id, etc.).
- Adhering to standard formats (ISO dates, currency codes, etc.).
- Providing APIs or export formats that other systems understand.
This allows downstream teams to:
- Join products together (Customer 360 + Transactions + Risk).
- Embed them into further data products (e.g., feeding ML models).
3.6 Secure & governed
Data products must:
- Enforce role-based access control.
- Clearly identify sensitive fields (PII, financial data, healthcare data).
- Implement measures to comply with regulations (GDPR, HIPAA, local privacy laws).
Again, this is a product concern, not “security’s job later.”
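As a sketch of masking built into the product itself (Python; the field list and salt are hypothetical), sensitive columns can be pseudonymized before the product is ever served:

```python
import hashlib

SENSITIVE_FIELDS = {"national_id", "phone_number"}  # declared in the product spec

def mask_row(row: dict, salt: str = "per-product-secret") -> dict:
    """Replace sensitive values with a salted hash so rows stay joinable
    (same input -> same token) without exposing the raw value."""
    masked = {}
    for key, value in row.items():
        if key in SENSITIVE_FIELDS and value is not None:
            digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()
            masked[key] = digest[:12]  # short pseudonymous token
        else:
            masked[key] = value
    return masked

row = {"customer_id": "C1", "national_id": "079123456789", "segment": "SME"}
print(mask_row(row)["segment"])  # non-sensitive fields pass through unchanged
```

Because masking is deterministic per product, downstream consumers can still join and aggregate on the masked key without ever seeing the raw identifier.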
3.7 Lifecycle managed
Finally, a data product is not immortal by default.
- It has versions (v1, v2, v3).
- There is a deprecation process when it’s superseded.
- There is someone responsible for maintenance and retirement.
This is where much of the “data swamp” comes from: old tables and dashboards that nobody owns anymore but still drive decisions.
4. Deep dive: DaaP as an operating model
Now that we’ve looked at a single data product, let’s zoom out. What does it mean to run your organization’s data function as DaaP?
4.1 From “data projects” to “data product portfolio”
Old mental model:
- “The business requests a dashboard / dataset.”
- Data team builds it as a project, ships it, and moves on.
- Months later, something breaks; nobody quite remembers the context.
DaaP mental model:
- “We manage a portfolio of data products.”
- Each product has:
- an owner
- customers (data consumers)
- a roadmap and KPIs
- New requests are prioritized as changes or additions to existing products, not isolated one-offs.
4.2 Clear roles: data product managers and domain teams
Industry leaders (Uber, Convoy, Netflix, etc.) have started to formalize the role of data product managers – people whose job is to treat datasets as products and internal data users as customers.
They work with:
- Domain teams (e.g., Lending, Retail, Logistics) who own the raw data and domain logic.
- Data engineers and analytics engineers who build the pipelines and models.
- Governance teams who maintain standards and compliance.
Key responsibilities:
- Understand data consumers’ needs.
- Prioritize what data products to build and how to evolve them.
- Define SLAs, contracts, and adoption goals.
- Measure usage and impact.
4.3 Self-service platforms: the engine of DaaP
DaaP relies heavily on a self-serve data platform, a core idea in data mesh.
Instead of:
- Central data team manually fulfilling every request,
you get:
- Standardized tooling (ingestion, transformations, catalog, quality, security).
- Domain teams using that tooling to create and maintain their own data products.
- Discovery and access managed via catalog and policy, not email and spreadsheets.
This is where platforms like dbt, modern data catalogs, and observability tools come into play.
4.4 SLAs, SLOs, and data health
Just like SaaS products commit to uptime and latency, DaaP pushes data teams to define:
- SLAs (Service Level Agreements) – e.g., daily refresh by 6am, >99% uptime.
- SLOs (Service Level Objectives) – internal targets for freshness, completeness, error rates.
- SLIs (Service Level Indicators) – the metrics that track those (e.g., time since last load, % rows failing quality tests).
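The three layers fit together mechanically: SLIs are measured, compared against SLO targets, and breaches of the SLA are what get reported to consumers. A minimal sketch (illustrative thresholds and numbers only):

```python
# Illustrative SLO targets for a hypothetical daily-refresh data product.
SLO = {"max_hours_since_load": 24, "max_failing_rows_pct": 1.0}

def evaluate_slos(slis: dict) -> list:
    """Compare measured SLIs against SLO targets; return any breaches."""
    breaches = []
    if slis["hours_since_load"] > SLO["max_hours_since_load"]:
        breaches.append("freshness SLO missed")
    if slis["failing_rows_pct"] > SLO["max_failing_rows_pct"]:
        breaches.append("quality SLO missed")
    return breaches

# Measured SLIs: last load 30h ago, 0.4% of rows failing quality tests.
print(evaluate_slos({"hours_since_load": 30, "failing_rows_pct": 0.4}))
# -> ['freshness SLO missed']
```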
Netflix’s Data Health initiative is a good example of this mentality in action: they treat healthy data as a prerequisite for AI and downstream innovation, not as a nice-to-have.
4.5 Organizational structures: centralized, embedded, and hub-and-spoke
Organizations typically evolve through:
- Centralized data team
- Pros: coherence, standardization.
- Cons: bottlenecks, poor domain understanding.
- Decentralized / embedded analysts & engineers
- Pros: domain expertise, speed for local priorities.
- Cons: duplication, inconsistent standards, silos.
- Hub-and-spoke (common in DaaP)
- Hub: central data platform team (tools, governance, quality).
- Spokes: domain data teams that own their data products using that platform.
This hub-and-spoke model fits DaaP well: it balances autonomy with consistency.
5. DaaP vs SaaS: what we can copy (and what we can’t)
Thinking in SaaS terms helps make DaaP concrete.
5.1 Patterns worth copying from SaaS
- Roadmaps & backlogs
- Treat new data needs as features or enhancements.
- Prioritize based on impact, not loudest voice.
- Customer discovery & feedback
- Interview your analysts, data scientists, and business users.
- Watch how they actually use existing data products.
- Adjust design based on real pain points.
- Versioned releases
- Release v1 quickly, then iterate.
- Use semantic versioning for schemas and APIs (v1, v1.1, v2).
- Metrics & adoption
- Track usage: queries per day, number of users, dependency graph.
- Track impact: time saved, revenue influenced, incidents reduced.
- SLAs and reliability focus
- Make uptime, freshness, and correctness part of the product promise.
5.2 Where data is different from classic SaaS
But there are important differences:
- Data products compose more deeply than SaaS apps.
- A single broken upstream table can affect dozens of downstream products.
- Regulatory and ethical constraints are heavier.
- Data may contain PII, financial, or health information.
- You need clear boundaries for what can be shared, sold, or used in AI models.
- Ownership is more tangled.
- Many datasets have multiple stakeholders and overlapping interests.
So DaaP can borrow SaaS discipline, but it must also:
- Handle lineage and blast radius analysis.
- Encode privacy and governance into the product definition.
- Support multi-tenant usage patterns inside an org (many domains using the same product differently).
6. Data monetization: when data itself becomes the product
Now, let’s connect this to revenue.
A DaaP approach is often the necessary foundation for data monetization and DaaS.
6.1 Common data monetization models
- Insight products
- Industry benchmarks (e.g., “average basket size by sector & region”).
- Scorecards and indices (e.g., SME health indices, risk indices).
- Sector or macro dashboards.
- Segment & score licensing
- Behavioral or credit segments sold to partners.
- Propensity or risk scores integrated into others’ systems.
- Decision APIs
- Real-time eligibility checks, pricing recommendations, risk classification.
- Raw / enriched data feeds (DaaS)
- Anonymized transaction panels.
- Enriched corporate and ownership datasets.
- Alternative data: logistics, mobility, IoT.
Acceldata notes that demand for DaaP has grown as more companies seek to package and sell curated datasets and insights as new revenue streams.
But monetization only works if:
- Data quality is high and consistent.
- Lineage and compliance are clear.
- Access and usage are governed and audited.
Otherwise, you’re shipping risk, not value.
7. Common pitfalls and failure modes
Based on patterns emerging in the industry (and in some of the sources we’ve mentioned), here are a few red flags:
- Renaming the old data team without changing behavior
- “We now treat data as a product” but nothing about ownership, SLAs, or lifecycle changes.
- Still ticket-driven, still ad-hoc.
- Confusing “data product” with “just a dashboard”
- Dashboards can be part of a data product, but if they’re built on fragile, undocumented queries, they’re not products.
- No clear data product owners
- When something breaks, everyone blames everyone else; no one feels responsible.
- Ignoring data quality & observability
- Treating quality as a side project, not a product feature.
- Leading to mistrust and “Excel as the real source of truth.”
- Over-engineering on day one
- Trying to implement full data mesh, DaaP, and DaaS across the entire organization overnight.
- Instead of starting with 3–5 high-value data products and scaling.
- Monetizing before internal maturity
- Trying to sell data to external clients before it’s reliable internally.
- This can damage reputation and regulatory standing very quickly.
8. A practical roadmap to DaaP (with DataCore in mind)
Let’s make this concrete from a DataCore + client perspective.
Step 1: Inventory & identify 3–5 candidate data products
Start with questions like:
- Where are the biggest recurring questions we answer using data?
- Which current datasets:
- have many consumers?
- cause frequent firefighting?
- feed critical decisions or regulatory reports?
Examples of candidate data products:
- Customer 360 & Profitability for a bank or fintech
- SME Credit Risk Panel combining internal behavior + external data
- Retail & Location Intelligence for store or branch expansion
- Logistics & Fulfillment Performance for e-commerce or 3PL
DataCore can help here by:
- Showing what external datasets we already have (macro, corporate, sector, public).
- Helping you map how your internal data could combine with ours to form unique products.
Step 2: Define each data product like a real product
For each candidate:
- Users & use cases
- Who will use it? (risk, marketing, branch ops, regulators, partners)
- What decisions will it support?
- Value proposition
- What problems does it solve?
- How will success be measured? (faster decisions, fewer write-offs, more revenue, less manual work)
- Scope & grain
- At what level does it operate? (customer, loan, transaction, merchant, store)
- What time horizon? (daily snapshots, full history?)
- SLAs & quality
- How fresh does it need to be?
- What data quality dimensions are critical (accuracy, completeness, timeliness, consistency)?
This becomes the product spec.
Step 3: Implement on DataCore’s data + HPC platform
This is where DataCore’s positioning is powerful: we combine datasets + compute + infrastructure.
For each product, we can:
- Connect & ingest
- Securely onboard your internal data (from on-premise systems, cloud, or hybrid).
- Link it with relevant DataCore datasets (market data, corporate data, macro indicators, public/regulatory datasets).
- Model & transform
- Build the pipelines and models (e.g., dbt-style transformations).
- Incorporate quality tests and observability hooks.
- Store & serve
- Store in an appropriate warehouse/lakehouse structure.
- Expose via:
- SQL endpoints
- APIs (for applications and partners)
- Dashboards (for business users)
- Bulk exports, where needed
- Govern & secure
- Apply access controls based on user roles and regulatory context.
- Mask or anonymize sensitive fields where required by Vietnamese law or international regulations.
- Monitor & iterate
- Set up SLOs, health dashboards, and alerting.
- Track usage and feedback to refine the product.
Step 4: Use HPC & AI to create higher-order data products
Once the foundational data products are in place, DataCore’s HPC and AI infrastructure lets you build even more sophisticated products on top:
- Train credit scoring models using Customer 360 + transactional data + macro & sector data, directly on the platform where the data lives.
- Develop churn or propensity models for telco or retail, powered by large sample sizes.
- Run optimization and simulation workloads (network optimization, inventory, pricing).
These models then become new data products:
- A Risk Score API used across underwriting systems.
- A Next Best Offer dashboard used by sales and marketing.
- A Branch / Store Network Optimization tool for strategy teams.
By running this inside DataCore, you avoid:
- Moving sensitive data to multiple external clouds.
- Re-implementing infrastructure for training and serving.
- Fragmenting governance and compliance.
Step 5: Monetize (carefully) with DaaS
For organizations ready to go further, DataCore can be the platform layer for DaaS:
- Host your curated datasets and models securely on DataCore.
- Define products, tiers, and pricing (subscriptions, per-API calls, per-volume feeds).
- Use DataCore’s APIs, metering, and access control to expose them to partners, clients, or even the wider market (subject to legal and regulatory constraints).
Examples:
- A bank offering an SME health index or merchant intelligence to ecosystem partners.
- A telco providing mobility and footfall insights to retailers and real-estate developers.
- A logistics provider selling supply chain visibility and benchmark data to shippers.
This turns DaaP into a strategic revenue stream, not just an internal efficiency play.
9. How DataCore differentiates (especially in Vietnam)
In the Vietnam context, you already have players like FiinGroup, Vietdata, and global providers offering datasets and analytics. DataCore’s role is to be both:
- A national-grade data & HPC platform, and
- A co-builder of data products and DaaP operating models.
Key differentiators:
- Integrated data + compute
- Many providers sell data or cloud compute; DataCore aims to provide both in one environment, optimized for heavy analytics and AI.
- Local regulatory and domain context
- Data residency, compliance with Vietnamese regulations, and understanding of local business practices matter hugely when dealing with financial, corporate, and public data.
- Ecosystem positioning
- By partnering with banks, telecoms, government agencies, and universities, DataCore can help create shared data products (e.g., SME insights, sector dashboards) that no single institution could build alone.
- Product thinking baked in
- Rather than just selling storage or raw feeds, DataCore can guide clients to define, own, and operate real data products with clear business outcomes.
10. Summary: from raw inputs to an ecosystem of data products
Let’s wrap up the key ideas.
- Data products are the units of value: curated tables, dashboards, models, and APIs that solve specific problems and have clear properties (discoverable, addressable, trustworthy, self-describing, interoperable, secure).
- Data as a Product (DaaP) is an operating model that applies product management discipline to datasets, encompassing ownership, roadmaps, SLAs, contracts, and lifecycle management. It grew out of data mesh and has been adopted by leaders like Netflix, dbt Labs, and many modern data teams.
- Data as a Service (DaaS) is the delivery and monetization layer: exposing those productized datasets to internal and external consumers via APIs and feeds, often with commercial models attached.
- SaaS thinking gives us a blueprint: roadmaps, versioning, SLAs, user research, and success metrics, all of which can and should be applied to data.
- DataCore’s role is to help organizations in Vietnam and Southeast Asia:
- Discover and define their most valuable data products
- Host, govern, and scale them on a secure data + HPC platform
- Build advanced AI and analytics products on top
- And, when appropriate, turn them into revenue-generating DaaS offerings.
If your company is saying “we want to treat data like a product,” the next step is to get specific:
- Which 3–5 datasets should become data products first?
- Who will own them?
- What platform will they live on?
- How will we measure their success?
That’s exactly where DataCore can sit alongside you, not just as a vendor, but as a data product and infrastructure partner.
1. https://www.womeninanalytics.com/podcast-episodes/ep18




