Luxury Recommendation Intelligence

Vector Search, Graph Structures and Distributed Product Knowledge

Case Type
Industrial Case / Luxury Operations / Recommendation Systems / Vector Databases / SLM Architecture / Enterprise Data

1. Context: Recommendation as Distributed Intelligence, Not a Widget

In a high-end luxury group, a “recommendation engine” cannot be treated as a small add‑on inside SAP, Salesforce or any single application.

The problem is not merely to “search for products” or to show “people who bought X also bought Y”. Recommendation in this context means connecting, in real time and with high precision:

product and reference
client and household
history and heritage
events attended and invitations declined
availability and scarcity
boutique, region and channel
style, aesthetics and cultural context
repair and after‑sales history
collection logic and design narratives
technical documentation and internal training material
demand and allocation strategy
dispersed, weak signals from many systems

No single transactional system (SAP, Salesforce, PLM, PIM, CRM, event platform, inventory tool) is designed to hold all of that in a manageable shape. Even with PIM/PLM and strong ERP/CRM cores, there are hard limits to how much semantic richness and cross‑domain linkage you can encode in relational schemas without creating an unmaintainable monster.

The architectural question therefore becomes:

How do we build a recommendation and discovery capability that can “see” across all these systems, without trying to centralize everything physically into one mega‑database?

The answer in this case was to adopt a hybrid architecture using:

existing enterprise systems as sources of truth
linking identifiers (IDs, hashes, tokens) instead of brute‑force replication
a search layer (e.g. Elasticsearch) for classical retrieval
a vector layer (vector database) for semantic similarity and intent
a graph/knowledge layer for relationships and context
and a set of small, specialized language models (SLMs), rather than one giant LLM, to orchestrate recommendation logic and data navigation

2. Problem Statement: Limits of Centralized Enterprise Systems

The starting situation was typical of a mature luxury group:

SAP (or equivalent ERP) managing orders, inventory, pricing, logistics.
Salesforce (or equivalent CRM) managing contacts, opportunities, campaigns, some clienteling.
PLM/PIM managing product master data, variants, technical attributes, media references.
DAM and other repositories for images, videos, collection books, training decks.
Event systems for high‑end events, private previews, ghost boutiques.
Boutique and POS systems, sometimes heterogeneous across regions and Maisons.
Repair and after‑sales systems with long lifecycle histories.
Internal documentation: Confluence, internal wikis, SharePoint, design docs.
Web and social content, sometimes mirrored or partially indexed internally.

From an architecture perspective:

Each system did its job reasonably well.
Integrations existed, but were focused on operational flows (orders, stock, CRM sync, etc.).
A “single central system” that could absorb everything was neither realistic nor desirable.

The group wanted:

Better client‑specific recommendations across channels (boutique, call center, events, digital).
Better internal recommendations for sales associates and client advisors (what to propose, what to study, which story to tell).
Better linkage between events and demand (which concepts resonated with which clients and segments).
A path toward AI‑augmented discovery (for internal users and, selectively, for clients).

Traditional recommendation approaches (simple collaborative filtering on transactional data, rules inside CRM, search‑only experiences) were visibly insufficient.

3. Logical Architecture: Layers and Roles

The proposed architecture was expressed in six logical layers, each with a distinct role.

3.1 Source Systems (Truth Holders)

Keep existing systems as sources of truth:

SAP / ERP for orders, inventory, pricing, logistics.
Salesforce / CRM for contacts, deals, campaigns, some preferences.
PLM / PIM for structured product attributes and hierarchies.
DAM for media and rich content.
Event platforms for invitations, attendance, journeys.
Boutique / POS systems for in‑store transactions and client interactions.
Repair / service systems for after‑sales history.
Internal documentation systems (Confluence, etc.) for product and training knowledge.

These systems are not replaced; they are exposed.

3.2 Linking Identifiers (Glue Without Monoliths)

Instead of physically consolidating all data into one huge store, the architecture relies on identifiers that allow traversal from one system to another:

product IDs and SKUs
collection IDs and capsule identifiers
customer IDs, households, client hashes
serial numbers, movement and case identifiers
event IDs and invitation IDs
document IDs and metadata keys
region, boutique and channel identifiers

Basic principle:

Don’t copy all data everywhere.
Maintain consistent keys and mapping tables so you can jump from one system to another when needed.
Only replicate what is needed for specific downstream tasks (e.g. embeddings, search indices).
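As an illustration of the mapping-table principle, here is a minimal sketch of an identifier crosswalk. The class name `IdCrosswalk`, the system labels ("erp", "pim") and the sample identifiers are assumptions made for this example, not names taken from the case.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class IdCrosswalk:
    """Hypothetical mapping table: one canonical key per entity,
    linked to the local identifier each source system uses."""
    _by_canonical: Dict[str, Dict[str, str]] = field(default_factory=dict)
    _reverse: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def link(self, canonical_id: str, system: str, local_id: str) -> None:
        # Record the forward and reverse mapping; no business data is copied.
        self._by_canonical.setdefault(canonical_id, {})[system] = local_id
        self._reverse[(system, local_id)] = canonical_id

    def translate(self, system_from: str, local_id: str,
                  system_to: str) -> Optional[str]:
        # Jump from one system's identifier to another's via the canonical key.
        canonical = self._reverse.get((system_from, local_id))
        if canonical is None:
            return None
        return self._by_canonical[canonical].get(system_to)


crosswalk = IdCrosswalk()
crosswalk.link("piece-001", "erp", "MAT-778")   # illustrative identifiers
crosswalk.link("piece-001", "pim", "REF-A12")
pim_id = crosswalk.translate("erp", "MAT-778", "pim")
```

Only keys are stored here; the attributes themselves stay in the source systems and are fetched on demand using the translated identifier.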

3.3 Search / Retrieval Layer (Elasticsearch or Equivalent)

A classic search layer continues to play an important role:

Full‑text search over product descriptions, collection texts, FAQs, training docs.
Faceted search over structured attributes (material, color, complication, size, price band, region availability).
Fast operational lookups (product codes, client names, document titles).

This layer is designed for:

Operational robustness and speed.
Structured and text search, not deep semantic similarity.
Use by sales, client advisors, support teams, internal staff.
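A combined full-text and faceted request of this kind could be expressed roughly as follows. This sketch only builds a request body in Elasticsearch Query DSL (a `bool` query with `multi_match`, `term` and `range` clauses); the field names (`description`, `collection_text`, `price`) and the facet names are assumptions, and the facet fields are assumed to be mapped as keyword fields.

```python
from typing import Any, Dict, Optional


def build_faceted_query(text: str, facets: Dict[str, str],
                        max_price: Optional[float] = None) -> Dict[str, Any]:
    """Build a search body: full-text match plus exact facet filters."""
    # Facets become non-scoring filters; sorting keeps the output deterministic.
    filters = [{"term": {name: value}} for name, value in sorted(facets.items())]
    if max_price is not None:
        filters.append({"range": {"price": {"lte": max_price}}})
    return {
        "query": {
            "bool": {
                # Text relevance is scored over the (assumed) text fields.
                "must": [{
                    "multi_match": {
                        "query": text,
                        "fields": ["description", "collection_text"],
                    }
                }],
                "filter": filters,
            }
        }
    }


body = build_faceted_query(
    "moonphase dress watch",
    {"material": "rose_gold", "region": "EMEA"},
    max_price=50000,
)
```

Keeping facets in the `filter` context means they restrict the result set without influencing relevance scores, which matches the operational role described for this layer.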

3.4 Vector Layer (Semantic Retrieval and Similarity)

A vector database is added as a separate layer to capture semantics that traditional search cannot:

Embeddings for products, derived from descriptions, collection narratives, internal notes, reviews, and even images when feasible.
Embeddings for documents: technical documentation, collection books, design stories, training material.
Embeddings for clients: aggregated taste profiles, style preferences, event attendance, interaction histories (in a privacy‑compliant way).
Embeddings for events and experiences themselves: textual descriptions, recaps, content presented.

This layer supports:

“Find pieces that match this style and story, even if they don’t share explicit attributes.”
“Find training documents that explain a concept relevant to this watch.”
“Suggest similar products across Maisons that share an aesthetic or narrative lineage.”
“Help an advisor discover less obvious, but semantically close, recommendations.”

Technically:

Embeddings are computed offline or near‑real‑time by SLM‑based pipelines or specialized embedding models.
Only necessary fields are embedded; raw data stays in the sources.
Each vector entry carries the identifiers that allow resolution back to ERP/CRM/PLM/DAM/etc.
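The point that each vector entry carries its resolving identifiers can be sketched as below. This is an in-memory stand-in using plain cosine similarity, not the API of any particular vector database; the `VectorEntry` type, the toy two-dimensional vectors and the identifiers are assumptions for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class VectorEntry:
    vector: Tuple[float, ...]
    source_system: str   # e.g. "pim" or "dam" (illustrative labels)
    source_id: str       # key used to resolve back to the source record


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query: Sequence[float], entries: List[VectorEntry],
            k: int = 3) -> List[VectorEntry]:
    """Return the k entries most similar to the query vector."""
    ranked = sorted(entries, key=lambda e: cosine(query, e.vector), reverse=True)
    return ranked[:k]


entries = [
    VectorEntry((1.0, 0.0), "pim", "REF-A12"),
    VectorEntry((0.0, 1.0), "pim", "REF-B07"),
    VectorEntry((0.9, 0.1), "dam", "DOC-311"),
]
hits = nearest((1.0, 0.0), entries, k=2)
```

The search returns identifiers rather than content, so the caller still goes back to PIM, DAM or another source for the authoritative record.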

3.5 Graph / Knowledge Layer (Relationship Backbone)

On top of identifiers and vectors, a graph layer organizes relationships:

client → purchases → pieces → collections
client → events attended → pieces presented → reactions
piece → collection → designer → thematic inspirations
piece → materials → manufacturing locations → constraints
piece → boutique / region availability → stock and allocation
piece → repair history → components replaced → documentation
event → product set → invitees → actual attendees
authenticity / piece identity → digital fingerprint → lifecycle events

The graph enables:

Knowledge‑based recommendations (“clients who attended this art event may appreciate pieces from this collection”).
Contextual routing (“if a piece is in this boutique and the client is in that city, propose local options first”).
Constraint reasoning (“this piece cannot be proposed because it’s allocated to another client or under embargo”).

The graph does not replicate all source data. It holds nodes and edges with identifiers and selected attributes, enough to traverse and infer.
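A minimal sketch of such a graph, holding only identifiers and typed edges, might look like this. The `KnowledgeGraph` class, the relation names ("attended", "presented") and the node identifiers are hypothetical; a production system would more likely use a dedicated graph store.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


class KnowledgeGraph:
    """Typed-edge graph holding identifiers only, no source payloads."""

    def __init__(self) -> None:
        # (source node, relation) -> set of destination nodes
        self._edges: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add_edge(self, src: str, relation: str, dst: str) -> None:
        self._edges[(src, relation)].add(dst)

    def follow(self, start: str, relations: Iterable[str]) -> Set[str]:
        """Traverse a chain of relations from a start node."""
        frontier = {start}
        for relation in relations:
            frontier = {
                dst
                for node in frontier
                for dst in self._edges.get((node, relation), set())
            }
        return frontier


graph = KnowledgeGraph()
graph.add_edge("client:c1", "attended", "event:e1")
graph.add_edge("event:e1", "presented", "piece:p1")
graph.add_edge("event:e1", "presented", "piece:p2")
# Pieces presented at events this client attended.
pieces = graph.follow("client:c1", ["attended", "presented"])
```

The traversal mirrors the "client → events attended → pieces presented" chain listed above and yields identifiers that can then be resolved against the source systems.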

3.6 Recommendation Engine (Hybrid Core)

The recommendation engine becomes a hybrid decision layer that combines:

Semantic similarity from the vector layer.
Classic signals: collaborative filtering (“clients with similar patterns”), popularity, recency.
Graph relationships (collections, events, cultural affinities, shared narratives).
Business rules: allocation, embargoes, margin targets, market constraints, segmentation.
Availability and scarcity: what can actually be proposed.
Context: channel (boutique, call center, event, digital), language, cultural profile.
Permissions and privacy: what a given advisor or system is allowed to see and propose.

This engine is not a single algorithm; it is a composition of:

scoring components (similarity, business value, relevance)
rule engines
SLM‑based “orchestrators” that interpret context and decide which signals to consider
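One way to picture this composition is hard business rules applied as filters, followed by a weighted blend of soft signals. The sketch below assumes exactly that design; the `Candidate` fields, the weight values and the linear scoring formula are illustrative assumptions, not the scoring used in the case.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    piece_id: str
    similarity: float      # from the vector layer
    collaborative: float   # from collaborative filtering
    graph_affinity: float  # from graph relationships
    available: bool
    embargoed: bool


# Illustrative weights; in practice these would be tuned per channel.
WEIGHTS: Dict[str, float] = {
    "similarity": 0.5,
    "collaborative": 0.2,
    "graph_affinity": 0.3,
}


def score(c: Candidate, weights: Dict[str, float]) -> float:
    """Linear blend of the soft signals."""
    return (weights["similarity"] * c.similarity
            + weights["collaborative"] * c.collaborative
            + weights["graph_affinity"] * c.graph_affinity)


def rank(candidates: List[Candidate],
         weights: Dict[str, float] = WEIGHTS) -> List[Candidate]:
    # Hard rules first: unavailable or embargoed pieces are never proposed.
    eligible = [c for c in candidates if c.available and not c.embargoed]
    return sorted(eligible, key=lambda c: score(c, weights), reverse=True)


ranked = rank([
    Candidate("p1", 0.9, 0.1, 0.2, available=True, embargoed=False),
    Candidate("p2", 0.5, 0.8, 0.9, available=True, embargoed=False),
    Candidate("p3", 0.99, 0.9, 0.9, available=True, embargoed=True),
])
```

Treating allocation and embargo as filters rather than as score penalties guarantees that a high-scoring but forbidden piece can never surface.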

4. SLM vs Big LLM: Why “Just Do a Giant Vector Project” Was Rejected

A central question was whether to:

Invest in a large, centralized embedding + vector project with a big LLM front‑end.
Or design a more modular, SLM‑oriented architecture with federated access to data and a more deliberate data modernization effort.

4.1 What Was Tested

The team ran a series of realistic tests:

Closed models (no internet access) were given access to PIM, DAM, PLM, Confluence, internal web content, some social content, etc.
These models were asked to:
- infer or reconstruct a conceptual schema of the product and client knowledge domain;
- answer batteries of expert-level questions (designed by product, sales, heritage and technical teams);
- navigate between documents and systems to produce coherent, reliable answers.

The stakes were high: the expectation was not “fun demos”, but operationally useful intelligence under realistic constraints.

4.2 What Happened

The results were very poor for the intended level of usage:

The models could produce plausible answers, but not reliably precise or consistent enough for high‑end client‑facing use.
They struggled with the specific internal structure of the data (PIM/PLM fields, internal naming, legacy documentation styles).
They often hallucinated relationships or missed critical details that matter in luxury (exact variant, history, allocation status).
The lack of consistent structure and linking in the existing data limited what pure embedding + LLM could achieve.

In short:

Given the current state of data organization, and its likely state in the near future, a large, centralized embedding project would not produce the architectural simplification or recommendation quality the group needed.

The issue was not the power of LLMs in the abstract; it was the misalignment between their generic capabilities and the specific, high‑precision, highly structured needs of luxury operations.

4.3 Alternative: Federated SLM + Structured Modernization

The alternative path chosen was:

Use smaller, task‑focused language models (SLMs) that:
- understand the specific schemas and conventions of PIM, PLM, CRM, etc.;
- operate within a clearly defined slice of the problem (e.g. document classification, attribute enrichment, recommendation candidate generation);
- are integrated as components in pipelines, not as a single “oracle”.
Invest in structured data modernization:
- clarifying key identifiers and links between systems;
- cleaning and enriching product and collection metadata where it matters most;
- structuring training and story content so it can be embedded and linked more effectively.
Use lakehouses and curated tables as intermediate layers, where SLMs can operate with stronger guarantees and simpler schemas.

This approach was more in line with how enterprises are starting to use SLMs:

As specialized agents that cooperate with humans and existing systems.
As tools to accelerate data cleaning, classification and linking, rather than as magical inference engines.

4.4 How It Was Validated

Instead of building a full MVP, the team:

Set up technical and operational challenges (e.g. “can we answer these 50 complex questions reliably from this set of systems if we use this architecture?”).
Put architects, business people and developers in the same loop, together with SLM‑based components, to iteratively improve representation and retrieval.
Measured not just answer quality, but explainability, traceability and integration cost.
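A challenge of this kind can be scored with a very small harness. The sketch below is an assumption about how such a measurement might be set up, not the team's actual tooling: answer quality is reduced to a case-insensitive exact match, and traceability to whether an answer cites at least one source identifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Answer:
    text: str
    source_ids: List[str] = field(default_factory=list)  # cited source records


def evaluate(expected: Dict[str, str],
             answers: Dict[str, Answer]) -> Dict[str, float]:
    """Score a question battery on accuracy and traceability."""
    total = len(expected)
    if total == 0:
        return {"accuracy": 0.0, "traceability": 0.0}
    correct = 0
    traceable = 0
    for question, reference in expected.items():
        answer = answers.get(question)
        if answer is None:
            continue  # an unanswered question counts against both metrics
        if answer.text.strip().lower() == reference.strip().lower():
            correct += 1
        if answer.source_ids:
            traceable += 1
    return {"accuracy": correct / total, "traceability": traceable / total}


report = evaluate(
    {"q1": "Rose gold", "q2": "1998"},          # illustrative references
    {
        "q1": Answer("rose gold", ["pim:REF-A12"]),
        "q2": Answer("2001"),
    },
)
```

Expert-level questions usually need human or rubric-based grading rather than exact match; the structure of the report is the part that carries over.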

These experiments were enough to build a solid, realistic business case:

A big “embedding everything into one vector black hole” project would be high cost, high risk and low predictability.
A federated SLM + hybrid architecture could deliver more tangible, incremental value, while improving the underlying data landscape in a way that benefits the whole enterprise.

5. Technical Deep Dive: How the Pieces Work Together

5.1 Data Flow for a Recommendation Scenario

Imagine a boutique advisor with a client in front of them, or a back‑office team designing a curated list for an event.

Context capture
- Client ID, region, language.
- Channel (boutique, event, digital, after‑sales).
- Occasion (service visit, new purchase, special invitation).
- Any explicit request (“something like this watch but dressier”).
Context resolution
- Retrieve the client’s profile from CRM (Salesforce) and relevant history (ERP/repair).
- Fetch relevant node(s) from the graph: client → previous purchases, events, behaviors.
- Pull relevant vector representations: client embedding, embeddings of pieces they liked, collections they responded to.
Candidate generation
- Use vector similarity to find pieces semantically related to:
  - the current product of interest;
  - the client embedding;
  - event themes.
- Filter candidates by basic constraints: region, availability, allocation, brand boundaries.
Graph and rules refinement
- Use graph relationships to:
  - incorporate collection logic (coherence, storytelling).
  - incorporate scarcity and strategic push/pull choices.
  - avoid collisions with pieces already allocated or under embargo.
- Apply business rules (pricing ranges, diversification, Maison policies).
SLM explanation and ranking
- Use SLM components to:
  - score candidates based on narrative match to the context (e.g. “this piece aligns with their history and the event theme”).
  - generate short rationales which can help advisors (why this piece, what story to tell).
Final recommendation set
- Present a small, curated list to the advisor or event planner.
- Ensure each item can be traced back to source data and constraints.
Feedback loop
- Capture explicit and implicit responses.
- Feed those back into embeddings, graph relationships and rules.
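The flow above can be sketched as a chain of small stages, each of which appends to a trace so that every final item can be explained. The stage functions, the context keys (`similar_pieces`, `allocated_or_embargoed`, `max_items`) and the trace labels are all hypothetical; candidate generation is stubbed with precomputed scores rather than a real vector lookup.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Recommendation:
    piece_id: str
    score: float
    trace: List[str] = field(default_factory=list)  # which signals touched it


Stage = Callable[[Dict, List[Recommendation]], List[Recommendation]]


def generate(context: Dict, recs: List[Recommendation]) -> List[Recommendation]:
    # Stand-in for vector similarity: the context supplies precomputed scores.
    return [Recommendation(pid, s, ["vector:similarity"])
            for pid, s in context["similar_pieces"]]


def filter_constraints(context: Dict,
                       recs: List[Recommendation]) -> List[Recommendation]:
    # Drop pieces that are allocated elsewhere or under embargo.
    blocked = context["allocated_or_embargoed"]
    kept = [r for r in recs if r.piece_id not in blocked]
    for r in kept:
        r.trace.append("rules:allocation-check")
    return kept


def shortlist(context: Dict, recs: List[Recommendation]) -> List[Recommendation]:
    # Keep a small, curated list, best score first.
    ordered = sorted(recs, key=lambda r: r.score, reverse=True)
    return ordered[:context["max_items"]]


def run_pipeline(context: Dict, stages: List[Stage]) -> List[Recommendation]:
    recs: List[Recommendation] = []
    for stage in stages:
        recs = stage(context, recs)
    return recs


context = {
    "similar_pieces": [("p1", 0.7), ("p2", 0.9), ("p3", 0.8)],
    "allocated_or_embargoed": {"p2"},
    "max_items": 1,
}
final = run_pipeline(context, [generate, filter_constraints, shortlist])
```

Because each stage has the same signature, graph refinement, SLM scoring or feedback capture could be added as further stages without changing the runner.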

5.2 Technical Dependencies and Constraints

Latency must be compatible with live interactions.
- Pre‑computations and caching are critical, especially for vector similarity and graph traversals.
Security and privacy are non‑negotiable.
- Client data must obey regional privacy constraints and group policies.
- SLMs and embeddings must run in controlled environments (no external calls, no uncontrolled data leakage).
Explainability is important in luxury.
- Advisors want to understand why a recommendation is being made, not blindly trust a black box.
- SLMs help generate human‑readable rationales based on the underlying signals.
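For the latency constraint, one simple form of caching is to memoize neighbor lookups in process. The sketch below uses Python's standard `functools.lru_cache`; the `_PRECOMPUTED` table and the function name `cached_neighbors` are stand-ins for a real similarity query, and the data is illustrative.

```python
from functools import lru_cache
from typing import Dict, Tuple

# Stand-in for an expensive vector or graph query (illustrative data).
_PRECOMPUTED: Dict[str, Tuple[str, ...]] = {
    "p1": ("p3", "p7"),
    "p2": ("p4",),
}
LOOKUPS = {"count": 0}  # counts how often the slow path actually runs


@lru_cache(maxsize=1024)
def cached_neighbors(piece_id: str) -> Tuple[str, ...]:
    """Return similar pieces; repeated calls are served from memory."""
    LOOKUPS["count"] += 1
    return _PRECOMPUTED.get(piece_id, ())


first = cached_neighbors("p1")
second = cached_neighbors("p1")  # served from the cache, no second lookup
```

`lru_cache` has no expiry, so cached results can go stale when stock or allocation changes; calling `cached_neighbors.cache_clear()` on such events, or applying availability filters after the cached lookup, is one way to handle that.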

6. Strategic Value and Positioning Toward AI

From a strategic point of view, this case does several things at once:

It reframes recommendation as a core intelligence capability, not a minor feature.
It clarifies that hybrid architectures (search + vectors + graph + rules + SLMs + enterprise data) are better suited to luxury than monolithic “LLM + vector” fantasies.
It forces the organization to improve data linkage and structure in targeted areas, which benefits many other use cases.
It sets a clear, pragmatic path toward AI:
- SLM agents become natural orchestrators and helpers, not unreliable oracles.
- Vector and graph layers become reusable infrastructure for other AI‑driven services (knowledge assistants, training, operations support).

For the client, the output of the engagement was not a “finished recommendation system”, but:

A well‑documented architecture and reasoning.
A set of technical experiments and results.
A business case showing why a big, centralized vector project would be misaligned, and why a federated SLM + hybrid stack would deliver more value and less risk.

7. Why This Case Matters

This case matters because it captures a transition:

From “recommendation as a CRM feature” to “recommendation as distributed enterprise intelligence”.
From “one big LLM and massive embeddings” to “SLM‑oriented, federated architectures aligned with how data and organizations actually work”.
From “data centralization fantasies” to “practical linking, vectorization and graph structuring where it counts”.

In luxury, where precision, narrative, scarcity and context are everything, this approach is not just technically elegant; it is operationally realistic.

It is, in effect, the bridge between:

demand intelligence,
event management,
dynamic inventory and piece identity,
product and client knowledge,
and AI‑ready enterprise architecture.