Global Search - Querying Multiple Tables for Search or Activity
One of the projects I am currently working on uses Prisma as an ORM. The project requires a search feature as well as a feature where there is a feed of changed items. As Prisma doesn't have a built in concept for handling these types of scenarios, I've investigated options to come up with an approach that best works for the project.
The main approach discussed here avoids using the ORM itself and is the one that I went forward with but the later section Using an ORM covers some ideas for tackling it within an ORM itself.
Summary of the Approach
- Create a type for the result/feed items (e.g.
SearchResult
/ActivityItem
) - Use raw SQL with
UNION
to query tables for common/like fields in the same operation (table sub queries need matching # columns forUNION
to work) - Include a field in each table sub query to indicate the type
- Load results as the created type(s)
Pros
- Single DB operation
- Paging / ordering is done across tables/models
Cons
- It can be awkward to include data specific to individual tables/models as unapplicable rows will require an empty field in their place
- If an ORM is being used, it may not be supported or require extra effort and code that is inconsistent with the code base
An example using Prisma
This example should work for any case that an ORM isn't used or when the ORM allows for raw queries. Prisma allows for raw SQL queries to be performed using the prisma.$queryRaw()
function - this can be subtitled with any raw query options.
For this example, I'll use an application that has companies, users, and notes (for companies) that are searchable.
+--------------------------+
| Company |
+--------------------------+
| - id (string) |
| - name (string) |
| - createdAt (datetime) |
| - updatedAt (datetime) |
+--------------------------+
+--------------------------+
| User |
+--------------------------+
| - id (string) |
| - firstName (string) |
| - lastName (string) |
| - createdAt (datetime) |
| - updatedAt (datetime) |
+--------------------------+
+--------------------------+
| Note |
+--------------------------+
| - id (string) |
| - title (string) |
| - companyId (string) |
| - createdAt (datetime) |
| - updatedAt (datetime) |
+--------------------------+
-
Create a type for the results
type SearchResult = { entityType: EntityType; entityId: string; title: string; createdAt: Date; updatedAt: string; };
-
Create a search function
/** * Search for companies, users, and notes based on their name/title * @param {string} search The search query to match * @param {number} pageNumber The page to return * @param {number} pageSize The number of results to include per-page * @returns {Promise<SearchResult[]>} A promise for the array of search results */ export default function search( searchQuery: string, pageNumber: number = 0, pageSize: number= 10, ) { return prisma.$queryRaw<SearchResult[]>(Prisma.sql` SELECT 'COMPANY' as entityType, id as entityId, name as title, createdAt, updatedAt FROM Company WHERE name LIKE '%${searchQuery}%' UNION SELECT 'USER' as entityType, id as entityId, CONCAT(firstName, ' ', lastName) as title, createdAt, updatedAt FROM User WHERE firstName LIKE '%${searchQuery}%' OR lastName LIKE '%${searchQuery}%' UNION SELECT 'NOTE' as entityType, id as entityId, title, createdAt, updatedAt FROM Note WHERE title LIKE '%${searchQuery}%' ORDER BY updatedAt DESC LIMIT ${pageSize} OFFSET ${pageSize * pageNumber} `); }
(Note: This example is using
LIKE
with wild cards, for better search results consider using full-text search - see Prisma Docs, MySQL Docs, PostgreSQL Docs) -
Use the search as needed
// Get the search query and page number from the request const searchResults = await search(searchQuery, pageNumber); // Serialize / return the results to the frontend for display
When displaying the data, the entityType
property can be used to indicate the type of result and link to it accordingly.
For paging, you'll likely need the total results in order to determine the number of pages. This can be done with a query such as the following:
prisma.$queryRaw<number>(Prisma.sql`
SELECT (
SELECT COUNT(*)
FROM Company
WHERE name LIKE '%${searchQuery}%'
) + (
SELECT COUNT(*)
FROM User
WHERE
firstName LIKE '%${searchQuery}%'
OR
lastName LIKE '%${searchQuery}%'
) + (
SELECT COUNT(*)
FROM Note
WHERE title LIKE '%${searchQuery}%'
) AS totalResults
`);
Both the total and the results can be retrieved in a single transaction if required, for Prisma this is done by using prisma.$transaction()
:
const [totalResults, searchResults] = await prisma.$transaction([
prisma.$queryRaw<number>(Prisma.sql`/** The count query */`),
prisma.$queryRaw<SearchResult[]>(Prisma.sql`/** The search query */`),
]);
Other Ideas and Considerations
Using a Base Table
If all items being searched have a common base (e.g. Movies, Shows, Songs could be considered "Media"), they can be consolidated in a single table that has the common fields (title, release date, length, etc.) along with a type
field. This would allow for searching the single table, though other considerations would need to be made, depending on the application:
- How to handle item specific fields
- JSON field?
- Separate linked tables?
- What if a item with fewer common fields needs to be added in the future such as a Book?
Using an ORM
When first considering approaches using Prisma, one of the original ideas I had was to search each model using the ORM and paginate that, for example:
async function searchVideos(search: string, page: number, pageSize: number) {
// Query Movies
const [totalMovies, movies] = await prisma.$transaction([
prisma.movie.count({
where: {
name: {
contains: search,
},
},
}),
prisma.movie.findMany({
where: {
name: {
contains: search,
},
},
skip: page * pageSize,
take: pageSize,
}),
]);
// Query Shows
const [totalShows, shows] = await prisma.$transaction([
prisma.shows.count({
where: {
name: {
contains: search,
},
},
}),
prisma.shows.findMany({
where: {
name: {
contains: search,
},
},
skip: page * pageSize,
take: pageSize,
}),
]);
return {
movies,
shows,
totalMovies,
totalShows
}
}
(Note: This specific example using Movies
and Shows
, is a good candidate to use a more generic Video
or Media
type/table as described in the above Using a Base Table)
This works ok if results are being shown in separate groups where paging is being done per-type, e.g. a quick search grouped by type such as:
Search Results:
Movies:
- Movie Match 1
- Movie Match 2
- Movie Match 3
TV Shows:
- TV Show Match 1
If results are shown in one list and you'd like to have a consistent total page size you will need to handle manipulating the page size - dividing the page size by the number of types of items and querying more from other types when reaching the end of results for any of the types.
Another issue comes with sorting as taking equal chunks from each table does not guarantee ordering across models. For example, if you take the first 5 Movies and first 5 Shows ordered ascending by date, it is possible that the 6th movie has an earlier date than one of the Shows but is excluded since sorting is done per model.
I did end up using this approach for a quick search feature since results are being grouped by type (i.e. show the latest 5 of each type that match) but chose to go with raw queries for the full search and activity feed.
Indexed Search
Rather than querying the tables directly, the items in them could be indexed externally for searching. This could be a separate table that is populated as data changes or using a provider such as algolia. This will help with performance with the tradeoff of maintaining the index.