Manticore Search for KVS

Replace deprecated Sphinx with Manticore for faster search and related videos

1. Why Manticore Instead of Sphinx

SphinxSearch is dead. The project has not been updated in years, the PHP API was last touched 8+ years ago, and Ubuntu still ships version 2.2 from 2016. Sphinx 3.x was never properly supported by KVS.

Manticore Search is a fork of Sphinx that is actively developed, fully open-source, and backward-compatible with Sphinx configuration. It adds:

  • Real-time tables — insert/update/delete without reindexing
  • MySQL protocol — use PDO or any MySQL client instead of the old Sphinx PHP API
  • CALL SUGGEST — built-in typo correction ("Did you mean...?")
  • Better morphology — improved language support including CJK
  • JSON attributes — flexible data storage
  • Query cache — built-in caching of frequent queries
  • Active development — regular releases, documentation, community support

KVS officially recommends Manticore over Sphinx as of 2025.

2. What It Does for KVS

Without Manticore, all search and related video queries hit your MySQL database directly. On high-traffic sites this becomes the bottleneck — MySQL search with LIKE or FULLTEXT is slow and CPU-intensive.

With Manticore:

  • Search becomes 10-100x faster (150-250 queries/sec with 1M+ records)
  • Related videos are calculated by text similarity instead of simple tag matching
  • MySQL load drops significantly — search queries no longer hit the database
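  • Morphological search — with stem_en, "running" also finds "run" and "runs"; irregular forms such as "ran" are not covered by stemming and need a lemmatizer such as lemmatize_en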
  • Fuzzy matching — handles typos and partial matches

3. Installation

On Ubuntu/Debian:

wget https://repo.manticoresearch.com/manticore-repo.noarch.deb
dpkg -i manticore-repo.noarch.deb
apt update
apt install manticore manticore-extra

On CentOS/RHEL:

yum install https://repo.manticoresearch.com/manticore-repo.noarch.rpm
yum install manticore manticore-extra

Start and enable the service:

systemctl enable manticore
systemctl start manticore

Verify it's running:

mysql -h 127.0.0.1 -P 9306 -e "SHOW STATUS"

You should see a status table. Manticore listens on port 9306 (MySQL protocol) and 9308 (HTTP) by default.

4. Architecture: Plain Index vs RT Tables

This is the key architectural decision. There are two approaches:

Plain Index (Sphinx-compatible)

  • Data is indexed from MySQL in bulk using the indexer command
  • Requires a cron job to reindex (typically every hour)
  • New content doesn't appear in search until the next reindex
  • Uses the same sphinx.conf format — easy migration from Sphinx
  • Simpler setup, good enough for most sites

RT (Real-Time) Tables

  • Data is inserted/updated/deleted via SQL commands
  • Changes appear instantly
  • Requires a sync script to keep Manticore in sync with MySQL
  • More flexible — can do incremental updates
  • Better for sites with frequent content additions

If you're migrating from Sphinx, start with a plain index: it's a drop-in replacement. If you're setting up fresh, consider RT tables.

5. Option A: Plain Index (Sphinx-Compatible)

This approach uses the same configuration format as Sphinx. If you have an existing sphinx.conf, you can use it almost as-is.

Edit /etc/manticoresearch/manticore.conf:

source kvs_videos
{
    type = mysql
    sql_host = localhost
    sql_user = YOUR_DB_USER
    sql_pass = YOUR_DB_PASS
    sql_db = YOUR_DB_NAME
    sql_port = 3306

    sql_query_pre = SET NAMES utf8
    sql_query_pre = SET CHARACTER SET utf8

    sql_query = SELECT v.video_id, v.video_id AS video_id_, \
        v.title, v.description, \
        v.duration, v.video_viewed, \
        IF(v.rating_amount>0, v.rating/v.rating_amount, 0) AS rating, \
        UNIX_TIMESTAMP(v.post_date) AS post_date, \
        v.resolution_type, \
        (SELECT GROUP_CONCAT(CONCAT_WS(',',t.tag,t.synonyms)) \
            FROM ktvs_tags t \
            JOIN ktvs_tags_videos tv ON tv.tag_id=t.tag_id \
            WHERE tv.video_id=v.video_id) AS tags, \
        (SELECT GROUP_CONCAT(CONCAT_WS(',',c.title,c.synonyms)) \
            FROM ktvs_categories c \
            JOIN ktvs_categories_videos cv ON cv.category_id=c.category_id \
            WHERE cv.video_id=v.video_id) AS categories, \
        (SELECT GROUP_CONCAT(CONCAT_WS(',',m.title,m.alias)) \
            FROM ktvs_models m \
            JOIN ktvs_models_videos mv ON mv.model_id=m.model_id \
            WHERE mv.video_id=v.video_id) AS models, \
        (SELECT title FROM ktvs_content_sources \
            WHERE content_source_id=v.content_source_id) AS content_source_title, \
        (SELECT title FROM ktvs_dvds \
            WHERE dvd_id=v.dvd_id) AS dvd_title \
        FROM ktvs_videos v \
        WHERE v.status_id=1 AND v.load_type_id>0

    sql_attr_timestamp = post_date
    sql_attr_bigint = duration
    sql_attr_bigint = video_viewed
    sql_attr_float = rating
    sql_attr_uint = resolution_type
    sql_attr_bigint = video_id_

    sql_attr_multi = uint category_id_general FROM query; \
        SELECT video_id, category_id FROM ktvs_categories_videos
}

index videos
{
    source = kvs_videos
    morphology = stem_en
    min_word_len = 2
    min_infix_len = 2
    index_exact_words = 1
    path = /var/lib/manticore/videos

    charset_table = 0..9, A..Z->a..z, _, a..z, \
        U+410..U+42F->U+430..U+44F, U+430..U+44F
}

searchd
{
    listen = 127.0.0.1:9306:mysql
    listen = 127.0.0.1:9308:http
    log = /var/log/manticore/searchd.log
    pid_file = /var/run/manticore/searchd.pid
    # no data_dir here: it switches Manticore to RT mode, which does not allow source/index sections in the config
}

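Build the index, then restart the service so searchd loads it:

indexer --all
systemctl restart manticore

Once searchd is running, rebuild with --rotate so the new index files are swapped in without stopping the service:

indexer --rotate --all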

Add cron for hourly reindex:

# crontab -e
0 * * * * /usr/bin/indexer --rotate --all > /dev/null 2>&1

Multi-language support: If you have translated titles/descriptions, modify the sql_query to include them:

CONCAT(v.title, ' ', COALESCE(v.title_de,''), ' ', COALESCE(v.title_es,'')) AS title,
CONCAT(v.description, ' ', COALESCE(v.description_de,''), ' ', COALESCE(v.description_es,'')) AS description,

6. Option B: Real-Time Tables (Recommended)

RT tables give you instant updates without cron-based reindexing. The trade-off is you need a sync script.

First, keep manticore.conf minimal — just the searchd section:

searchd
{
    listen = 127.0.0.1:9306:mysql
    listen = 127.0.0.1:9308:http
    data_dir = /var/lib/manticore
    log = /var/log/manticore/searchd.log
    pid_file = /var/run/manticore/searchd.pid
    query_log = /var/log/manticore/query.log
    seamless_rotate = 1
    preopen_tables = 1
    qcache_max_bytes = 256M
    qcache_thresh_msec = 1000
    qcache_ttl_sec = 3600
}

Create the RT table via MySQL protocol:

mysql -h 127.0.0.1 -P 9306 -e "
CREATE TABLE IF NOT EXISTS videos_rt (
    title text indexed stored,
    description text indexed stored,
    categories text indexed stored,
    tags text indexed stored,
    models text indexed stored,
    dvd_title text indexed stored,
    post_date timestamp,
    duration bigint,
    video_viewed bigint,
    rating float,
    category_ids multi,
    title_sort string
) morphology='stem_en' min_infix_len='2' index_exact_words='1'
  charset_table='0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F'
"

Then you need a sync script that reads from MySQL and inserts into Manticore. The script should:

  1. Connect to both MySQL (KVS database) and Manticore (port 9306)
  2. Query all active videos with their tags, categories, models
  3. For incremental updates: only sync videos modified in the last N hours
  4. For full sync: truncate the RT table and re-insert everything (run once daily)
  5. Use REPLACE INTO to handle updates
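
The write side of these steps could look roughly like the sketch below. It is not a complete sync script: it assumes each video was already read from the KVS database into an array keyed like the videos_rt columns, and the function names (mcQuote, buildReplaceSql, syncVideos) and the lowercased title_sort value are illustrative choices, not part of KVS or Manticore.

```php
<?php
// Sketch only: the write side of a sync script for the videos_rt table.
// Assumption: each $video array was already read from the KVS database
// (for example with a query like the sql_query shown in Option A).

/** Quote a string as a SQL literal for Manticore (escapes backslashes and single quotes). */
function mcQuote(string $value): string
{
    return "'" . strtr($value, ['\\' => '\\\\', "'" => "\\'"]) . "'";
}

/** Build one REPLACE INTO statement; REPLACE overwrites the row with the same id. */
function buildReplaceSql(array $video): string
{
    $title = (string)($video['title'] ?? '');
    // Multi-value attributes use the (1,2,3) list syntax; force every id to int.
    $categoryIds = implode(',', array_map('intval', $video['category_ids'] ?? []));

    $values = [
        (int)($video['video_id'] ?? 0),
        mcQuote($title),
        mcQuote((string)($video['description'] ?? '')),
        mcQuote((string)($video['categories'] ?? '')),
        mcQuote((string)($video['tags'] ?? '')),
        mcQuote((string)($video['models'] ?? '')),
        mcQuote((string)($video['dvd_title'] ?? '')),
        (int)($video['post_date'] ?? 0),      // Unix timestamp
        (int)($video['duration'] ?? 0),
        (int)($video['video_viewed'] ?? 0),
        sprintf('%.4F', (float)($video['rating'] ?? 0)),
        '(' . $categoryIds . ')',
        mcQuote(strtolower($title)),          // assumption: title_sort holds a lowercased title
    ];

    return 'REPLACE INTO videos_rt (id, title, description, categories, tags, models, '
        . 'dvd_title, post_date, duration, video_viewed, rating, category_ids, title_sort) '
        . 'VALUES (' . implode(', ', $values) . ')';
}

/** Push rows into Manticore; a full sync empties the table first. */
function syncVideos(PDO $manticore, iterable $videos, bool $full = false): int
{
    if ($full) {
        $manticore->exec('TRUNCATE TABLE videos_rt');
    }
    $count = 0;
    foreach ($videos as $video) {
        $manticore->exec(buildReplaceSql($video));
        $count++;
    }
    return $count;
}
```

A caller would open a PDO connection to 127.0.0.1:9306, fetch the rows to sync from MySQL, and pass them to syncVideos() with $full set to true for the nightly rebuild. Note that a full sync leaves the table empty until the re-insert finishes.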

Run the sync script via cron:

# Incremental every 15 minutes
*/15 * * * * php /path/to/sync_manticore.php
# Full rebuild at midnight
0 0 * * * php /path/to/sync_manticore.php full

The sync script is site-specific — it depends on your database schema, which fields you index, which languages you support, and what filters you need. This is where custom configuration is required.

7. Custom Search Script

KVS expects the search endpoint to return XML in a specific format. Whether you use a plain index or RT tables, you need a PHP script that:

  1. Receives query parameters from KVS (query, limit, from, sort_by, etc.)
  2. Queries Manticore via PDO (MySQL protocol on port 9306)
  3. Returns XML in KVS format

Basic example using PDO instead of the old Sphinx API:

<?php
$query = trim($_GET['query'] ?? '');
$limit = max(1, min(100, (int)($_GET['limit'] ?? 20)));
$from = max(0, (int)($_GET['from'] ?? 0));

if ($query === '') {
    header('Content-Type: application/xml; charset=utf-8');
    echo '<search_feed total_count="0" from="0"></search_feed>';
    exit;
}

$pdo = new PDO('mysql:host=127.0.0.1;port=9306', '', '', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
]);

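// addslashes() only protects the SQL string literal; full-text operators
// such as @ ( ) " | in $query still reach MATCH() unescaped.
$escaped = addslashes($query);
// Use "videos" instead of "videos_rt" with the plain index from Option A.
// field_weights reaches a custom ranker expression only through user_weight.
$sql = "SELECT id, WEIGHT() as w FROM videos_rt
        WHERE MATCH('$escaped')
        ORDER BY WEIGHT() DESC
        LIMIT $from, $limit
        OPTION field_weights=(title=100, categories=30, tags=20, description=15),
               ranker=expr('sum((lcs*1000+exact_hit*5000)*user_weight)+bm25*10')";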

$rows = $pdo->query($sql)->fetchAll();
$meta = $pdo->query('SHOW META')->fetchAll();
$total = 0;
foreach ($meta as $m) {
    if ($m['Variable_name'] === 'total_found') $total = (int)$m['Value'];
}

header('Content-Type: application/xml; charset=utf-8');
echo "<search_feed total_count=\"$total\" from=\"$from\">\n";
foreach ($rows as $r) {
    echo "<gallery>\n";
    echo "  <weight>{$r['w']}</weight>\n";
    echo "  <kvs_data>{$r['id']}</kvs_data>\n";
    echo "</gallery>\n";
}
echo "</search_feed>";

This is a minimal example. A production script should include:

  • Input sanitization and query length limits
  • Sort mode handling (sort_by parameter from KVS)
  • Duration, category, and HD filters
  • Error handling with fallback
  • Slow query logging
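
As a rough sketch of the first three items, the helpers below cap and escape the query, map a sort parameter through a whitelist, and build numeric filters. The parameter names (duration_from, category_id) and the sort_by values are placeholders, since the exact values KVS sends are not listed here; the attribute names come from the videos_rt table defined earlier.

```php
<?php
// Sketch only. The parameter names and sort_by values are placeholders;
// check what your KVS version actually sends before relying on them.

/** Trim, cap the length, and backslash-escape Manticore full-text operators. */
function sanitizeMatchQuery(string $raw, int $maxLen = 100): string
{
    $query = trim($raw);
    $query = function_exists('mb_substr')
        ? mb_substr($query, 0, $maxLen, 'UTF-8')
        : substr($query, 0, $maxLen);
    // ! " $ ' ( ) - / < @ \ ^ | ~ have special meaning inside MATCH().
    return preg_replace('/[!"$\'()\-\/<@\\\\^|~]/', '\\\\$0', $query);
}

/** Quote the escaped query as a SQL string literal. */
function sqlLiteral(string $value): string
{
    return "'" . strtr($value, ['\\' => '\\\\', "'" => "\\'"]) . "'";
}

/** Map a sort_by value to an ORDER BY clause through a fixed whitelist. */
function orderByFor(string $sortBy): string
{
    $map = [
        'post_date'    => 'post_date DESC',
        'video_viewed' => 'video_viewed DESC',
        'rating'       => 'rating DESC',
        'duration'     => 'duration DESC',
    ];
    // Anything unknown falls back to relevance order.
    return $map[$sortBy] ?? 'WEIGHT() DESC';
}

/** Build extra WHERE conditions from numeric request parameters. */
function buildFilters(array $params): string
{
    $sql = '';
    if (isset($params['duration_from'])) {
        $sql .= ' AND duration >= ' . max(0, (int)$params['duration_from']);
    }
    if (isset($params['category_id'])) {
        $sql .= ' AND ANY(category_ids) = ' . max(0, (int)$params['category_id']);
    }
    return $sql;
}
```

In the search script, the WHERE clause would then be assembled as MATCH(sqlLiteral(sanitizeMatchQuery($query))) followed by buildFilters($_GET), with orderByFor() supplying the ORDER BY. Because only whitelisted strings and integer casts reach the SQL text, user input never lands in the statement unescaped.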

8. KVS Integration

Once your search script is working, configure it in KVS:

  1. Go to Admin Panel → Plugins → External Search
  2. Set these values:
  • Use external search: Always
  • Display external search results: Completely replace internal search
  • API call: http://127.0.0.1/admin/manticore/search.php?query=%QUERY%&limit=%LIMIT%&from=%FROM%
  • Outgoing URL: https://yourdomain.com

Important: Use 127.0.0.1 (localhost) for the API call, not your domain name. The request then stays on the server instead of going out through DNS, SSL, and any CDN in front of your domain, which significantly reduces latency. You'll need to configure nginx/Apache to serve the script on localhost port 80.

To enable Manticore for related videos:

  1. Go to Website UI → Pages
  2. Find all "related" blocks (site and in-player)
  3. Change mode_related to: Related by title (external search plugin)

Test search on your site and verify related videos are working.

9. Advanced Features

Typo Correction ("Did You Mean?")

Manticore has built-in CALL SUGGEST that returns spelling suggestions based on your indexed data:

CALL SUGGEST('amatuer', 'videos_rt');
-- Returns: amateur

You can integrate this into your search script to auto-correct queries or show "Did you mean: amateur?" to users. This is a major UX improvement that Sphinx never had.
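
One way to wire this into the search script is to run CALL SUGGEST per word and pick a candidate from the returned rows. The sketch below assumes the result set carries the suggest, distance, and docs columns; pickSuggestion and its maxDistance cut-off are hypothetical names chosen for illustration.

```php
<?php
// Sketch only: choose a "Did you mean" candidate from CALL SUGGEST rows.
// Assumption: each row has 'suggest', 'distance' and 'docs' columns.

function pickSuggestion(string $word, array $rows, int $maxDistance = 2): ?string
{
    $best = null;
    foreach ($rows as $row) {
        $distance = (int)$row['distance'];
        $docs = (int)$row['docs'];

        // Distance 0 means the word itself is in the dictionary: nothing to fix.
        if ($distance === 0) {
            return null;
        }
        if ($distance > $maxDistance) {
            continue;
        }
        // Prefer the closest candidate; break ties by document count.
        if ($best === null
            || $distance < $best['distance']
            || ($distance === $best['distance'] && $docs > $best['docs'])) {
            $best = ['suggest' => (string)$row['suggest'], 'distance' => $distance, 'docs' => $docs];
        }
    }
    return $best === null ? null : $best['suggest'];
}
```

The rows would come from something like $pdo->query("CALL SUGGEST(" . $pdo->quote($word) . ", 'videos_rt')")->fetchAll(). CALL SUGGEST works on a single word, so a multi-word query needs one call per word.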

Custom Ranking

Manticore supports custom ranker expressions. Instead of the default proximity_bm25 ranker, you can fine-tune relevance:

OPTION ranker=expr('sum(
    lcs*1000 +
    bm25*10 +
    word_count*500 +
    exact_hit*5000 +
    if(min_hit_pos<3, 2000, 0)
)')

This prioritizes:

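  • exact_hit — the query matches an entire field exactly (highest boost)
  • min_hit_pos<3 — the first matched keyword is one of the first two words of a field
  • lcs — longest common subsequence between the query and the field (rewards phrase matches)
  • word_count — the number of distinct query keywords matched in the field; more is better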

Field Weights

Not all fields are equal. Title matches should rank higher than description matches:

OPTION field_weights=(title=100, models=90, categories=30, tags=20, description=15)

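Adjust these based on your content. If your site is model-focused, increase model weight. If categories are important for your niche, boost them. Note that with ranker=expr(...), field weights only take effect through the user_weight factor, so multiply the per-field terms by user_weight when you combine the two options.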

Query Cache

Manticore has built-in query caching. Configure in manticore.conf:

qcache_max_bytes = 256M
qcache_thresh_msec = 1000
qcache_ttl_sec = 3600

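This caches result sets for queries that take longer than 1 second, using up to 256 MB, and keeps each entry for 1 hour. Repeated slow searches are then served from cache. Queries that finish faster than the threshold are never cached, so lower qcache_thresh_msec if you want popular but fast queries cached as well.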

10. Performance Tips

  • Use 127.0.0.1 for API calls — don't route through your domain. This avoids SSL and Cloudflare overhead; the local web server still serves the script.
  • Limit max_matches — set it to a reasonable value (1000-5000) to reduce memory usage per query.
  • Monitor slow queries — enable query_log in manticore.conf and watch for queries taking >500ms.
  • Separate server for high traffic — if your site does 1000+ searches/minute, move Manticore to a separate server. It only needs the MySQL protocol port open.
  • RAM — Manticore loads indexes into RAM. For 100K videos, expect ~200-500MB. For 1M+ videos, plan for 2-4GB dedicated to Manticore.
  • Don't use memcached for PHP sessions — KVS frontend cache can overfill memcached and invalidate sessions. Use a RAM disk instead if you need fast sessions.
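
For the slow-query tip above, a small helper can pull the slowest statements out of query.log. This sketch assumes the SQL-style log format, in which every entry starts with a comment containing "real <sec> wall <sec> found <n>" followed by the statement; if your log looks different, the pattern needs adjusting. The function name slowQueries is hypothetical.

```php
<?php
// Sketch only: list logged queries slower than a threshold.
// Assumed line shape: /* <date> conn N real 0.004 wall 0.004 found 3 */ SELECT ...

function slowQueries(iterable $lines, float $minSeconds = 0.5): array
{
    $slow = [];
    foreach ($lines as $line) {
        // Capture the wall-clock time and the statement after the closing comment.
        if (preg_match('~\bwall (\d+\.\d+)\b.*?\*/\s*(.*)$~', (string)$line, $m)
            && (float)$m[1] >= $minSeconds) {
            $slow[] = [(float)$m[1], trim($m[2])];
        }
    }
    // Slowest first.
    usort($slow, fn($a, $b) => $b[0] <=> $a[0]);
    return $slow;
}
```

It can be fed a file iterator, for example slowQueries(new SplFileObject('/var/log/manticore/query.log')), and returns pairs of seconds and statement text.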

11. Migrating from Sphinx

If you already have Sphinx running:

  1. Install Manticore (it can coexist with Sphinx temporarily)
  2. Copy your sphinx.conf to /etc/manticoresearch/manticore.conf — the format is compatible
  3. Stop Sphinx: systemctl stop sphinxsearch
  4. Start Manticore: systemctl start manticore
  5. Rebuild indexes: indexer --rotate --all
  6. Update your search script to use PDO (MySQL protocol) instead of the old SphinxAPI. The old API still works but PDO is cleaner and more reliable.
  7. Test — verify search and related videos work correctly
  8. Remove Sphinx: apt remove sphinxsearch

The migration is straightforward because Manticore is backward-compatible with Sphinx configuration. The biggest improvement comes from switching to PDO and optionally moving to RT tables.