# Retrieval
**Suppose you have an embedding model. Given a user, how would you
decide which items to recommend?**
At serve time, given a query, you start by doing one of the following
(both cases are sketched in the code after this list):

- For a matrix factorization model, the query (or user) embedding is known
  statically, and the system can simply look it up from the
  user embedding matrix.
- For a DNN model, the system computes the query embedding \(\psi(x)\)
  at serve time by running the network on the feature vector \(x\).
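As a minimal illustration of these two cases, here is a NumPy sketch. The
names `user_embeddings` and `psi` are placeholders for your trained model's
artifacts, not part of any particular library:

```python
import numpy as np

def query_embedding_mf(user_id: int, user_embeddings: np.ndarray) -> np.ndarray:
    """Matrix factorization: the user embedding is known statically,
    so serving is just a row lookup in the learned user matrix."""
    return user_embeddings[user_id]

def query_embedding_dnn(x: np.ndarray, psi) -> np.ndarray:
    """DNN: compute psi(x) at serve time with a forward pass of the
    trained network over the feature vector x."""
    return psi(x)
```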
Once you have the query embedding \(q\), search for item embeddings
\(V_j\) that are close to \(q\) in the embedding space.
This is a nearest neighbor problem. For example, you can return the top k
items according to the similarity score \(s(q, V_j)\).
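For instance, with dot product as the similarity score \(s(q, V_j)\), the
exhaustive top-k search fits in a few lines of NumPy. This is a sketch;
`item_embeddings` is a placeholder for the learned item embedding matrix:

```python
import numpy as np

def top_k_items(q: np.ndarray, item_embeddings: np.ndarray, k: int) -> np.ndarray:
    """Returns indices of the k items closest to q under dot-product
    similarity, sorted from best score to worst."""
    scores = item_embeddings @ q              # s(q, V_j) for every item j
    top_k = np.argpartition(-scores, k)[:k]   # unordered top k in O(n)
    return top_k[np.argsort(-scores[top_k])]  # order those k by score
```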
You can use a similar approach in related-item recommendations. For example,
when the user is watching a YouTube video, the system can first look up the
embedding of that item, and then look for embeddings of other items
\(V_j\) that are close in the embedding space.
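Continuing the sketch above, related-item retrieval reuses the same search
with an item embedding as the query. Here `watched_id` is a hypothetical ID
for the video being watched:

```python
# Query with the watched video's own embedding; fetch one extra result
# so the watched video can be dropped from its own recommendations.
related = top_k_items(item_embeddings[watched_id], item_embeddings, k=11)
related = related[related != watched_id][:10]
```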
Large-scale retrieval
---------------------
To compute the nearest neighbors in the embedding space, the system
can exhaustively score every potential candidate. Exhaustive scoring
can be expensive for very large corpora, but you can use either of
the following strategies to make it more efficient:
- If the query embedding is known statically, the system can perform
  exhaustive scoring offline, precomputing and storing a list of the
  top candidates for each query. This is a common practice for
  related-item recommendation (see the first sketch after this list).
- Use approximate nearest neighbors. Google provides an open-source tool
  on GitHub called
  [ScaNN](https://github.com/google-research/google-research/tree/master/scann)
  (Scalable Nearest Neighbors). This tool performs efficient vector
  similarity search at scale (see the second sketch after this list).
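As a sketch of the first strategy, offline precomputation for related-item
recommendation can be as simple as an all-pairs score matrix. Note that the
\(O(n^2)\) memory cost makes this naive version practical only for moderate
corpus sizes:

```python
import numpy as np

def precompute_related_items(item_embeddings: np.ndarray, k: int) -> np.ndarray:
    """Exhaustively scores every item pair offline and stores the top k
    neighbors per item, so serving reduces to a table lookup."""
    scores = item_embeddings @ item_embeddings.T   # all-pairs dot products
    np.fill_diagonal(scores, -np.inf)              # never recommend an item to itself
    return np.argpartition(-scores, k, axis=1)[:, :k]
```

For the second strategy, the following is adapted from the usage example in
the ScaNN repository. The tuning values are illustrative; the right settings
depend on your corpus size and latency budget:

```python
import scann

# Build an approximate nearest neighbor index over the item embeddings.
searcher = (
    scann.scann_ops_pybind.builder(item_embeddings, 10, "dot_product")
    .tree(num_leaves=2000, num_leaves_to_search=100, training_sample_size=250000)
    .score_ah(2, anisotropic_quantization_threshold=0.2)
    .reorder(100)
    .build()
)

# Approximate top 10 item indices (and scores) for a query embedding q.
neighbors, distances = searcher.search(q)
```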
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-25 UTC."],[[["\u003cp\u003eRecommender systems leverage embedding models to identify items similar to a user's preferences or a given item.\u003c/p\u003e\n"],["\u003cp\u003eThe system finds relevant items by searching for embeddings that are close to the user or item embedding in the embedding space, effectively solving a nearest neighbor problem.\u003c/p\u003e\n"],["\u003cp\u003eFor large-scale retrieval, efficiency can be improved by precomputing top candidates or using approximate nearest neighbor search techniques like ScaNN.\u003c/p\u003e\n"]]],[],null,["# Retrieval\n\n**Suppose you have an embedding model. Given a user, how would you\ndecide which items to recommend?**\n\nAt serve time, given a query, you start by doing one of the following:\n\n- For a matrix factorization model, the query (or user) embedding is known statically, and the system can simply look it up from the user embedding matrix.\n- For a DNN model, the system computes the query embedding \\\\(\\\\psi(x)\\\\) at serve time by running the network on the feature vector \\\\(x\\\\).\n\nOnce you have the query embedding \\\\(q\\\\), search for item embeddings\n\\\\(V_j\\\\) that are close to \\\\(q\\\\) in the embedding space.\nThis is a nearest neighbor problem. For example, you can return the top k\nitems according to the similarity score \\\\(s(q, V_j)\\\\).\n\nYou can use a similar approach in related-item recommendations. For example,\nwhen the user is watching a YouTube video, the system can first look up the\nembedding of that item, and then look for embeddings of other items\n\\\\(V_j\\\\) that are close in the embedding space.\n\nLarge-scale retrieval\n---------------------\n\nTo compute the nearest neighbors in the embedding space, the system\ncan exhaustively score every potential candidate. Exhaustive scoring\ncan be expensive for very large corpora, but you can use either of\nthe following strategies to make it more efficient:\n\n- If the query embedding is known statically, the system can perform exhaustive scoring offline, precomputing and storing a list of the top candidates for each query. This is a common practice for related-item recommendation.\n- Use approximate nearest neighbors. Google provides an open-source tool on GitHub called [ScaNN](https://github.com/google-research/google-research/tree/master/scann) (Scalable Nearest Neighbors). This tool performs efficient vector similarity search at scale."]]