A Practical AI Pipeline for Turning Media Into Searchable Catalogs
This post describes a production-ready pipeline for converting images, audio, video, and text into searchable, faceted catalogs using AI — without vector databases, embeddings, or opaque ranking systems.
The core idea is simple:
Use AI to project content into explicit, bounded attributes and keywords, then index those using normal relational + graph structures.
1. Ingest Any Media
The pipeline accepts:
- Images (photos, posters, ads, menus)
- Audio (podcasts, interviews, calls)
- Video (shows, talks, lectures)
- Text (documents, transcripts, summaries)
Each media item is stored once, unchanged.
2. Extract Structured Observations (One AI Call)
For each item, a single AI call extracts a structured set of observations.
Examples of observations:
- Title / subtitle
- Language and spelling quality
- Countries and cultural relevance
- Safety signals (obscene, controversial)
- Keywords (English + native language)
- Holiday or event references
- Date ranges and importance
- Shareability and confidence
Rules enforced:
- Strict JSON output
- Fixed fields only
- Hard length limits
- Use null if uncertain
- No guessing
This keeps output predictable and auditable. A simple policy ensures user submissions are vetted
for importance and relevance, and screened for obscenity or controversy before becoming
discoverable and searchable by others.
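To make this concrete, here is a sketch of the kind of flat, bounded payload a single call might return once decoded. The field names mirror the list above; the values are hypothetical:
// Illustrative only: one decoded observation payload for a single item.
// Field names follow the list above; values here are made up.
$observation = array(
    'title' => 'Happy Valentine\'s Day',
    'language' => 'en',
    'spelling' => 9,                    // 1-10 spelling quality
    'countries' => array('US', 'CA'),
    'obscene' => 1,                     // safety signals, low = clean
    'controversial' => 1,
    'keywords' => array('valentine', 'love', 'chocolate'),
    'keywordsNative' => array(),        // empty when the primary language is English
    'holidayName' => 'Valentine\'s Day',
    'startDate' => '2026-02-14',
    'endDate' => '2026-02-14',
    'shareability' => 8,
    'confidence' => 0.9                 // null if uncertain
);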
3. Normalize + Compact Attributes
After AI output:
- Numbers are bucketed (e.g. confidence → 0.95)
- Text fields are truncated
- Keywords are normalized and deduplicated
- Arrays are hard-limited in size
- Total attribute size is capped (e.g. 1 KB)
Result:
- A small, flat attribute map per item
- Safe to store directly on the stream
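A minimal sketch of this normalization pass is below. The function name, the 100-character truncation, and the 1 KB cap are illustrative policy choices, not the actual implementation used later in this post:
// Sketch only: compact a raw observation into a small, flat attribute map.
// Bucket sizes, field lengths, and the 1 KB cap are illustrative.
function compactAttributes(array $obs)
{
    $attr = array();
    // Bucket numbers (e.g. confidence rounded to two decimals)
    if (isset($obs['confidence'])) {
        $attr['confidence'] = round($obs['confidence'], 2);
    }
    // Truncate free-text fields to a fixed length
    foreach (array('title', 'subtitle') as $f) {
        if (!empty($obs[$f])) {
            $attr[$f] = mb_substr($obs[$f], 0, 100);
        }
    }
    // Normalize, deduplicate, and hard-limit keywords
    $keywords = array();
    foreach ((array)($obs['keywords'] ?? array()) as $k) {
        $k = preg_replace('/[^a-z0-9]/', '', mb_strtolower($k));
        if ($k !== '') {
            $keywords[$k] = true;
        }
    }
    $attr['keywords'] = array_slice(array_keys($keywords), 0, 10);
    // Cap the total serialized size (e.g. 1 KB)
    while (strlen(json_encode($attr)) > 1024 && !empty($attr['keywords'])) {
        array_pop($attr['keywords']);
    }
    return $attr;
}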
4. Index Using Relations (No Vector Search)
Each attribute becomes a relation, for example:
attribute/keyword=valentine
attribute/country=IL
attribute/holiday=Valentine’s Day
attribute/confidence=0.95
Key properties:
- Each category is its own index
- Prefix search works naturally
- Facets combine efficiently
- Storage stays relational
- No embeddings or vector DBs required
Search becomes:
“Find items with keyword romance, country IL, confidence ≥ 0.8”
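As a sketch of what that query looks like relationally, assuming a simple category/value relation table (the table name, columns, and connection details are illustrative, not the actual schema):
// Sketch only: one row per attribute relation, then plain SQL for faceted search.
$pdo = new PDO('mysql:host=localhost;dbname=catalog', 'user', 'pass');
$itemId = 'GroupsAPI/image/abc123'; // hypothetical item identifier

// Index: write each attribute as a (category, value, item_id) relation
$insert = $pdo->prepare(
    "INSERT INTO item_relations (category, value, item_id) VALUES (?, ?, ?)"
);
$relations = array(
    array('keyword', 'valentine'),
    array('country', 'IL'),
    array('holiday', "Valentine's Day"),
    array('confidence', '0.95')
);
foreach ($relations as $r) {
    $insert->execute(array($r[0], $r[1], $itemId));
}

// Search: facets combine with joins; prefix search maps to LIKE on an indexed column
$query = $pdo->prepare(
    "SELECT k.item_id
       FROM item_relations k
       JOIN item_relations c ON c.item_id = k.item_id AND c.category = 'country' AND c.value = ?
       JOIN item_relations f ON f.item_id = k.item_id AND f.category = 'confidence'
                            AND CAST(f.value AS DECIMAL(3,2)) >= ?
      WHERE k.category = 'keyword' AND k.value LIKE ?"
);
$query->execute(array('IL', 0.8, 'roman%'));
$itemIds = $query->fetchAll(PDO::FETCH_COLUMN);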
5. Audio & Video: Time-Based Semantic Indexing
For long media (audio/video):
- Transcribe with timestamps (speaker-aware if available)
- Split evenly across the timeline (not by meaning)
- For each chunk:
  - Generate summary
  - Generate search keywords
  - Preserve time range
- Store clips as their own streams
- Index clips and episodes separately
This enables:
- Jump-to-topic search
- Clip-level discovery
- Episode-level aggregation
- Explainable results (“this matched at 00:42:10”)
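A hedged sketch of the even, time-based split, assuming the transcript arrives as an array of timestamped segments and using a hypothetical summarizeChunk() to stand in for one AI call per chunk:
// Sketch only: split a transcript into fixed-length time windows and summarize each.
// summarizeChunk() is hypothetical; it returns a summary and keywords for one chunk.
function indexByTime(array $segments, $chunkSeconds = 300)
{
    $chunks = array();
    foreach ($segments as $seg) { // each $seg: array('start' => sec, 'end' => sec, 'text' => ...)
        $bucket = (int)floor($seg['start'] / $chunkSeconds);
        if (!isset($chunks[$bucket])) {
            $chunks[$bucket] = array(
                'start' => $bucket * $chunkSeconds,
                'end' => ($bucket + 1) * $chunkSeconds,
                'text' => ''
            );
        }
        $chunks[$bucket]['text'] .= ' ' . $seg['text'];
    }
    $clips = array();
    foreach ($chunks as $chunk) {
        $obs = summarizeChunk($chunk['text']); // hypothetical AI call
        $clips[] = array(
            'start' => $chunk['start'],
            'end' => $chunk['end'],
            'summary' => $obs['summary'],
            'keywords' => $obs['keywords']
        );
    }
    return $clips; // each clip becomes its own stream, indexed separately from the episode
}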
6. AI Keywords, Not User Keywords
Keywords are:
- Generated by AI
- Normalized and bounded
- Expanded (carefully) for search recall
This avoids:
- User-supplied tag spam
- Missing synonyms
- Language mismatch
The system can later re-index without re-processing media.
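A minimal sketch of how bounded keyword handling might look, with a small curated synonym map standing in for careful expansion (the function and map are hypothetical):
// Sketch only: merge AI-generated keywords with a curated synonym map,
// keeping everything lowercase, deduplicated, and bounded.
function expandKeywords(array $keywords, array $synonyms, $max = 20)
{
    $out = array();
    foreach ($keywords as $k) {
        $k = mb_strtolower(trim($k));
        $out[$k] = true;
        foreach ((array)($synonyms[$k] ?? array()) as $syn) {
            $out[mb_strtolower($syn)] = true; // expansion is explicit and auditable
        }
    }
    return array_slice(array_keys($out), 0, $max);
}

// Usage: re-indexing later only needs the stored keywords, not the original media.
$synonyms = array('valentine' => array('romance', 'love')); // curated, not guessed
$keywords = expandKeywords(array('Valentine', 'chocolate'), $synonyms);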
7. What This Enables
With the same pipeline you can build:
- Image catalogs (e-commerce, menus, ads)
- Dating profile discovery
- Restaurant menus with semantic search
- Podcast / interview archives
- Lecture and education libraries
- Community photo archives
- Media monitoring tools
- Historical video libraries
- Multilingual search experiences
- Faceted browsing without embeddings
8. Why This Works
- AI is used as a semantic compiler, not a chatbot
- Output is bounded, structured, and auditable
- Indexing is explicit and explainable
- Search is fast, cheap, and debuggable
- No vendor lock-in to vector databases
- Works today with standard SQL + relations
Summary
This pipeline turns unstructured media into search-native data by:
- Extracting bounded semantics with AI
- Normalizing into small attributes
- Indexing via relations
- Enabling keyword + facet search at scale
It’s not a recommendation engine.
It’s a catalog intelligence system.
And it works for images, audio, video, and text using the same architecture.
Videos
For real-time videoconferencing, we can use the speech recognition built into browsers and get speaker diarization without having to guess voices, because we already know who is speaking in the teleconference or live show being recorded.
Otherwise, we hook into external APIs from providers like AssemblyAI to run transcription jobs with speaker diarization.
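For the external-API path, here is a hedged sketch of submitting a transcription job with speaker labels. It assumes AssemblyAI's v2 transcript endpoint and an already-accessible audio URL; verify the details against their current documentation:
// Sketch only: request a transcript with speaker diarization from AssemblyAI.
// The audio URL is hypothetical; the API key comes from the environment.
$payload = json_encode(array(
    'audio_url' => 'https://example.com/episodes/42.mp3',
    'speaker_labels' => true // enables speaker diarization
));
$ch = curl_init('https://api.assemblyai.com/v2/transcript');
curl_setopt_array($ch, array(
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $payload,
    CURLOPT_HTTPHEADER => array(
        'authorization: ' . getenv('ASSEMBLYAI_API_KEY'),
        'content-type: application/json'
    ),
    CURLOPT_RETURNTRANSFER => true
));
$job = json_decode(curl_exec($ch), true);
curl_close($ch);
// Poll GET https://api.assemblyai.com/v2/transcript/{id} until the status is "completed",
// then read the speaker-attributed, timestamped segments from the response.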
From these transcripts, we extract keywords, summaries, metadata, and so on, and make it all easily searchable and shareable. For a multi-year podcast, the end result is a browsable archive of episodes and clips, each with its own summary and keywords.
Images
For images, a typical ingestion script built on our technology works like this. You can take a peek at the code:
<?php
function GroupsAPI_upload_post($params)
{
    // Users::loggedInUser(true);
    if (empty($_FILES['image']['tmp_name'])) {
        throw new Q_Exception("No image uploaded");
    }
    $binary = file_get_contents($_FILES['image']['tmp_name']);

    // 1. Save image (folder-only icon semantics)
    $tempKey = 'tmp_' . uniqid('', true);
    $paths = Q_Image::save(array(
        'data' => $binary,
        'path' => 'Q/uploads',
        'subpath' => "Streams/images/$tempKey",
        'save' => 'Streams/image',
        'skipAccess' => true
    ));
    if (empty($paths[''])) {
        throw new Exception("Image save failed");
    }
    $tempIconPath = $paths['']; // folder-only path

    // 2. Run LLM observations
    $inputs = array('images' => array($binary));
    $observations = array(
        'semanticExtraction' => array(
            'promptClause' =>
                'First, extract explicit semantic facts visible in the image. ' .
                'Do not infer or invent text. ' .
                'If something is not clearly present, return null. ' .
                'Assume the current year is ' . date('Y') . '. ' .
                'If a known holiday is clearly referenced, identify it and give its date range ' .
                'for this year only. Use YYYY-MM-DD format.',
            'fieldNames' => array(
                'title',
                'subtitle',
                'holidayName',
                'startDate',
                'endDate'
            ),
            'example' => array(
                'title' => 'Happy Valentine\'s Day',
                'subtitle' => 'Celebrate love with something sweet',
                'holidayName' => 'Valentine\'s Day',
                'startDate' => '2026-02-14',
                'endDate' => '2026-02-14'
            )
        ),
        /* === NEW, ADDITIVE === */
        'holidayAnalysis' => array(
            'promptClause' =>
                'If holidayName is present, evaluate the global importance of this holiday. ' .
                'Return an integer from 1 to 10 representing importance to at least 1 million people worldwide. ' .
                '10 = globally significant holidays (e.g. Christmas, Ramadan). ' .
                '7-9 = widely observed national or religious holidays (e.g. Valentine\'s Day, Hanukkah). ' .
                '4-6 = regional or multi-country holidays. ' .
                '1-3 = minor or niche holidays. ' .
                'If no real holiday is present, return null. ' .
                'Base this on widely accepted real-world observance, not personal opinion.',
            'fieldNames' => array(
                'holidayImportance'
            ),
            'example' => array(
                'holidayImportance' => 9
            )
        ),
        /* === END NEW === */
        'languageQuality' => array(
            'promptClause' =>
                'Analyze the primaryImage and determine the primary language used, spelling quality, and expression naturalness. ' .
                'Base this judgment only on visible text and widely-known linguistic conventions. ' .
                'Do not infer intent or audience.',
            'fieldNames' => array(
                'language',
                'spelling',
                'expressions'
            ),
            'example' => array(
                'language' => 'ru',
                'spelling' => 9,
                'expressions' => 8
            )
        ),
        'culturalRelevance' => array(
            'promptClause' =>
                'Determine which countries and cultures this content is most relevant to based on visible imagery, symbols, and text. ' .
                'Do not guess. Only include countries that are strongly implied.',
            'fieldNames' => array(
                'countries',
                'culturalSpecificity'
            ),
            'example' => array(
                'countries' => array('RU','UA'),
                'culturalSpecificity' => 7
            )
        ),
        'timing' => array(
            'promptClause' =>
                'Infer when this content would be relevant in the real world. ' .
                'Provide exact date ranges for the years 2025 and 2026 only. ' .
                'Use real holidays or events if clearly implied. ' .
                'Do not invent holidays. If no timing is evident, return an empty array.',
            'fieldNames' => array(
                'dates',
                'evergreen'
            ),
            'example' => array(
                'dates' => array(
                    array('2025-01-07','2025-01-07'),
                    array('2026-01-07','2026-01-07')
                ),
                'evergreen' => 5
            )
        ),
        'contentClassification' => array(
            'promptClause' =>
                'Classify what this content is and how it is presented. ' .
                'Describe type, occasion, tone, and sentiment without judging quality.',
            'fieldNames' => array(
                'contentType',
                'occasion',
                'tone',
                'sentiment'
            ),
            'example' => array(
                'contentType' => 'greeting',
                'occasion' => array('orthodoxChristmas'),
                'tone' => array('festive'),
                'sentiment' => 'positive'
            )
        ),
        'safety' => array(
            'promptClause' =>
                'Assess whether the content contains obscenity or material likely to cause controversy. ' .
                'Rate visibility of such elements, not intent.',
            'fieldNames' => array(
                'obscene',
                'controversial'
            ),
            'example' => array(
                'obscene' => 1,
                'controversial' => 1
            )
        ),
        'discoveryQuality' => array(
            'promptClause' =>
                'Evaluate how suitable this content is for discovery and sharing by others. ' .
                'Besides this, derive most common 10 keywords as single English words ' .
                'people might use to search for this. Normalize to lowercase alphanumeric only. ' .
                'If the primary language is English, set keywordsNative to empty array. ' .
                'Otherwise, fill keywordsNative with top 10 keywords as single words in that language.',
            'fieldNames' => array(
                'shareability',
                'confidence',
                'keywords',
                'keywordsNative'
            ),
            'example' => array(
                'keywords' => array("Orthodox", "Christmas", "Religious", "Holiday"),
                'keywordsNative' => array("Православный", "Рождество", "Религиозный", "Праздник"),
                'shareability' => 8,
                'confidence' => 0.9
            )
        )
    );
    $llm = AI_LLM::create('openai');
    $results = $llm->process($inputs, $observations);

    // 3. Policy gate
    if (!GroupsAPI_Images::accept($results)) {
        return Q_Response::json(array(
            'accepted' => false,
            'results' => $results
        ));
    }

    // 4. Compact for attributes
    $attr = GroupsAPI_Images::forAttributes($results);
    if (!GroupsAPI_Images::fitsAttributes($attr)) {
        throw new Exception("Observation attributes exceed limit");
    }

    // 5. Create Streams/image
    $title = Q::ifset($attr, 'title', Q::ifset($attr, 'subtitle', 'Untitled Image'));
    unset($attr['title']);
    $stream = Streams::create(
        'GroupsAPI',
        'GroupsAPI',
        'Streams/image',
        array(
            'title' => $title,
            'icon' => $paths[''], // folder only
            'attributes' => $attr
        ),
        array(
            'skipAccess' => true
        )
    );

    /*
     * 6. Rename temp folder → final stream name
     */
    $uploadsRoot = APP_WEB_DIR;
    $timestamp = time();
    $finalDir = implode(DS, array(
        $uploadsRoot, 'Q', 'uploads', 'Streams', $stream->publisherId, $stream->name, 'icon', $timestamp
    ));
    if (!is_dir($parentDir = dirname($finalDir))) {
        @mkdir($parentDir, 0777, true);
    }
    if (!rename($uploadsRoot . DS . $tempIconPath, $finalDir)) {
        throw new Exception("Failed to finalize image storage");
    }

    /*
     * 7. Update stream icon to final folder
     */
    $publisherId = $stream->publisherId;
    $streamName = $stream->name;
    $stream->update(array(
        'icon' => "{{baseUrl}}/Q/uploads/Streams/$publisherId/$streamName/icon/$timestamp"
    ), array(
        'skipAccess' => true
    ));

    return Q_Response::json(array(
        'accepted' => true,
        'streamName' => $stream->name,
        'icon' => Streams::iconUrl($stream)
    ));
}

