Skip to content

Workers Overview

Class: ` **Location**:models/workers/backlink_statistics/main.py` Purpose:

Configuration: - backlink_stats_enabled: True - backlink_stats_schedule: "0 2 * * *" # Daily at 2 AM - backlink_stats_top_limit: 100

Health Checks: Available via worker health endpoint

Dev Worker

Class: ` **Location**:models/workers/dev/main.py` Purpose:

Health Checks: Available via worker health endpoint

Elasticsearch Indexer Worker

Class: ` **Location**:models/workers/elasticsearch_indexer/main.py` Purpose:

Health Checks: Available via worker health endpoint

Entity Diff Worker

Class: ` **Location**:models/workers/entity_diff/rdf_serializer.py` Purpose:

Health Checks: Available via worker health endpoint

General Stats Worker

Class: ` **Location**:models/workers/general_stats/main.py` Purpose:

Health Checks: Available via worker health endpoint

Id Generation Worker

Class: IdGeneratorWorker Location: models/workers/id_generation/id_generation_worker.py Purpose: Asynchronous worker service for generating Wikibase entity IDs using range-based allocation. This worker reserves blocks (ranges) of IDs from the database to minimize contention during high-volume entity creation. It monitors range status, handles graceful shutdown, and provides health checks for monitoring. The worker initializes Vitess and Enumeration services, then runs a continuous loop checking ID range availability. IDs are allocated from pre-reserved ranges to ensure efficient, low-latency ID generation.

Configuration: - WORKER_ID: Unique worker identifier (default: auto-generated)

Health Checks: Available via worker health endpoint

Incremental Rdf Worker

Class: IncrementalRDFWorker Location: models/workers/incremental_rdf/incremental_rdf_worker.py Purpose: Worker that consumes entity change events and generates incremental RDF diffs. This worker: 1. Consumes entity change events from entitybase.entity_change Kafka topic 2. Looks up revision metadata in MySQL to get content hashes 3. Fetches entity snapshots from S3 for both old and new revisions 4. Computes RDF diffs using IncrementalRDFUpdater 5. Publishes RDF change events to incremental_rdf_diff Kafka topic

Health Checks: Available via worker health endpoint

Json Dumps Worker

Class: ` **Location**:models/workers/json_dumps/main.py` Purpose:

Health Checks: Available via worker health endpoint

Notification Cleanup Worker

Class: NotificationCleanupWorker Location: models/workers/notification_cleanup/main.py Purpose: Worker that periodically cleans up old notifications to enforce limits.

Health Checks: Available via worker health endpoint

Ttl Dumps Worker

Class: ` **Location**:models/workers/ttl_dumps/main.py` Purpose:

Health Checks: Available via worker health endpoint

User Stats Worker

Class: UserStatsWorker Location: models/workers/user_stats/user_stats_worker.py Purpose:

Health Checks: Available via worker health endpoint

Watchlist Consumer Worker

Class: WatchlistConsumerWorker Location: models/workers/watchlist_consumer/main.py Purpose: Worker that consumes entity change events and creates notifications for watchers.

Configuration: - kafka_bootstrap_servers: Comma-separated list of Kafka broker addresses - kafka_topic: Kafka topic for entity changes (default: "wikibase-entity-changes")

Health Checks: Available via worker health endpoint

Dependencies: Requires aiokafka for Kafka consumption.