Debug: Campaign Processing Steps

Campaign ID: 239


Step 1: Initialize Scraping Process

API Endpoint: api/start_background_scraping.php (POST)

Database Table: campaigns_impact (SELECT)

Field | Value
ID | 239
Title | Debeers
Keywords | Debeers
Platforms | twitter, youtube, instagram, news, blogs, facebook
Status | active
Date From | 2026-05-19
Date To | 2026-05-19
Client Keywords | (empty)
Monitor Links | (empty)
Relevancy Threshold | 0.70

ℹ️ This prepares the campaign for data collection. This step must be completed before Step 2.

Step 2: Collect Data from Platforms

API Endpoint: api/collect_data_step2.php (POST)

Database Tables: twitter_raw, youtube_raw, instagram_raw, news_raw, blogs_raw, facebook_raw (INSERT, SELECT COUNT)

Database Table: background_jobs (SELECT)

ℹ️ What this means: Background jobs track long-running scraping processes. If no jobs are found, one of the following applies:
• Scraping completed and jobs were cleaned up
• Scraping happened directly without creating job records (this is normal)
• No background scraping has been started yet
✅ Check "Data Collection Status by Platform" below to see whether data was actually collected. That is what matters.

No background jobs found for this campaign

This is normal! Background jobs are optional tracking records. The important thing is whether data was collected; check "Data Collection Status by Platform" below.

ℹ️ This collects 100 results from each selected platform. This may take 10-20 minutes.

Data Collection Status by Platform

ℹ️ Limits: Loaded from platform_limits table (managed via Settings > Platform Limit)

Platform | Table Name | Records Count | Limit | Status
Twitter | twitter_raw | 1 | 1000 | ✓ Data Collected
Youtube | youtube_raw | 12 | 100 | ✓ Data Collected
Instagram | instagram_raw | 397 | 100 | ✓ Data Collected
Facebook | facebook_raw | 17 | 100 | ✓ Data Collected
News | news_raw | 0 | 1000 | ⚠ No Data
Blogs | blogs_raw | 0 | 1000 | ⚠ No Data
Total Records | - | 427 | - | ✓ Data Available
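The status table above can be re-derived from the record counts. The sketch below is illustrative only: the counts and limits are copied from the table, while the status rule (any rows means "Data Collected") and the over-limit check are assumptions, not the page's actual logic.

```python
# Record counts and limits copied from the status table above.
# Tuple layout: (records_count, limit).
COLLECTED = {
    "twitter": (1, 1000),
    "youtube": (12, 100),
    "instagram": (397, 100),
    "facebook": (17, 100),
    "news": (0, 1000),
    "blogs": (0, 1000),
}


def platform_status(count: int) -> str:
    """Assumed rule: a platform counts as collected once it has any rows."""
    return "Data Collected" if count > 0 else "No Data"


total_records = sum(count for count, _ in COLLECTED.values())
empty_platforms = [name for name, (count, _) in COLLECTED.items() if count == 0]
over_limit = [name for name, (count, limit) in COLLECTED.items() if count > limit]

print(total_records)    # 427
print(empty_platforms)  # ['news', 'blogs']
print(over_limit)       # ['instagram']
```

Under these assumptions, Instagram is the only platform whose stored row count (397) exceeds its configured limit (100), which may be worth checking against the platform_limits table.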

Step 3: AI Relevancy Analysis

API Endpoint: api/check_raw_mentions.php (POST)

OpenAI key: loaded from api_keys_db.api_keys (then OPENAI_API_KEY env, then config). Model: gpt-4o-mini.
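The lookup order described above (database, then environment, then config) is a simple fallback chain. A minimal sketch, assuming a caller-supplied `db_lookup` function and a config key named `openai_api_key`; both names are hypothetical and do not come from the application:

```python
import os
from typing import Callable, Mapping, Optional


def resolve_openai_key(
    db_lookup: Callable[[], Optional[str]],
    env: Mapping[str, str] = os.environ,
    config: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the first non-empty key: database, then env, then config."""
    candidates = (
        db_lookup(),                           # stands in for api_keys_db.api_keys
        env.get("OPENAI_API_KEY"),             # environment variable named in the text
        (config or {}).get("openai_api_key"),  # hypothetical config key
    )
    for key in candidates:
        if key:
            return key
    return None


# Example: the database has no key, so the environment value wins over config.
key = resolve_openai_key(
    lambda: None,
    env={"OPENAI_API_KEY": "sk-from-env"},
    config={"openai_api_key": "sk-from-config"},
)
print(key)  # sk-from-env
```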

Database Tables: twitter_raw, youtube_raw, instagram_raw, news_raw, blogs_raw, facebook_raw (SELECT), ai_relevancy_results (INSERT)

ℹ️ Run sends every raw row to the API with force_reprocess (full pass, re-scores by AI). Re-Run does the same. Batches of 50; large campaigns can take 15+ minutes.
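To get a feel for the batch size, the sketch below splits this campaign's 427 raw rows into batches of 50. The row IDs are placeholders and the helper name `batches` is illustrative; the endpoint's real batching code is not shown on this page.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 50  # batch size stated in the note above


def batches(rows: Sequence[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield list(rows[start:start + size])


raw_row_ids = list(range(1, 428))  # placeholder IDs standing in for 427 raw rows
batch_list = list(batches(raw_row_ids))

print(len(batch_list))      # 9
print(len(batch_list[-1]))  # 27
```

A full pass over this campaign therefore needs nine batches, the last one partial.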

AI Relevancy Results

Database Table: ai_jobs (SELECT)

No AI jobs found for this campaign. Click the button above to create AI jobs from raw data.

Total AI Relevancy Results: 400

Breakdown by Platform:

Platform | Results Count
instagram | 387
twitter | 1
youtube | 12
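Comparing this breakdown with the raw counts under "Data Collection Status by Platform" shows where the 27 unscored rows (427 raw, 400 scored) sit. The sketch only does the arithmetic on the numbers reported on this page; it does not explain why those rows have no AI result.

```python
# Raw counts from "Data Collection Status by Platform".
raw_counts = {
    "twitter": 1,
    "youtube": 12,
    "instagram": 397,
    "facebook": 17,
    "news": 0,
    "blogs": 0,
}

# AI result counts from the breakdown above; platforms not listed count as 0.
ai_counts = {"instagram": 387, "twitter": 1, "youtube": 12}

unscored = {
    platform: raw - ai_counts.get(platform, 0)
    for platform, raw in raw_counts.items()
    if raw - ai_counts.get(platform, 0) > 0
}

print(unscored)                # {'instagram': 10, 'facebook': 17}
print(sum(unscored.values()))  # 27
```

On these numbers, none of the 17 Facebook rows and 10 of the Instagram rows have an entry in ai_relevancy_results.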

Breakdown by Relevancy:

Relevancy Label | Count

Breakdown by Sentiment:

Sentiment | Count
neutral | 107
positive | 262
negative | 31

Breakdown by Processing State:

Processing State | Count
fetched | 400

Sample Results (Latest 5):

ID | Platform | Platform Post ID | Relevancy Score | Relevancy Label | Sentiment | Processing State | Raw Item ID | Raw Table | Created At
103340 | instagram | N/A | 0.8000 | N/A | positive | fetched | N/A | N/A | 2026-05-19 12:43:35
103365 | instagram | N/A | 0.9000 | N/A | positive | fetched | N/A | N/A | 2026-05-19 12:43:35
103364 | instagram | N/A | 0.7000 | N/A | neutral | fetched | N/A | N/A | 2026-05-19 12:43:35
103363 | instagram | N/A | 0.8500 | N/A | positive | fetched | N/A | N/A | 2026-05-19 12:43:35
103362 | instagram | N/A | 0.9000 | N/A | positive | fetched | N/A | N/A | 2026-05-19 12:43:35

Step 4: Complete Processing & Save Results

API Endpoint: api/sync_campaign_articles.php (POST)

Database Table: campaign_articles (INSERT/UPDATE, SELECT COUNT, SELECT)

ℹ️ This finalizes all collected data and syncs relevant items to campaign_articles.

Campaign Articles (Final Processed Results)

Total Campaign Articles: 320

ℹ️ This will sync relevant items (relevancy score ≥ 70% - campaign threshold: 0.7) from ai_relevancy_results to campaign_articles. Note: 320 relevant items found, 0 can be synced (others may already exist in campaign_articles).
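The note above combines two checks: a score at or above the campaign threshold, and the item not already being present in campaign_articles. A minimal sketch under stated assumptions: the names `AiResult` and `select_syncable` are hypothetical, existing rows are matched by result ID (the real endpoint may match on another column), and the 0.65 row is made up to show a rejected item.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class AiResult:
    result_id: int
    relevancy_score: float


def select_syncable(
    results: Iterable[AiResult],
    threshold: float,
    already_synced: Set[int],
) -> List[AiResult]:
    """Keep results at or above the threshold that are not synced yet."""
    return [
        r
        for r in results
        if r.relevancy_score >= threshold and r.result_id not in already_synced
    ]


sample = [
    AiResult(103340, 0.80),  # scores taken from the sample AI results
    AiResult(103365, 0.90),
    AiResult(103364, 0.70),  # exactly at the threshold, so it qualifies
    AiResult(1, 0.65),       # hypothetical row below the threshold
]

first_run = select_syncable(sample, 0.70, already_synced=set())
second_run = select_syncable(
    sample, 0.70, already_synced={r.result_id for r in first_run}
)

print(len(first_run))   # 3
print(len(second_run))  # 0
```

The second run mirrors the state reported here: relevant items exist, but none remain to sync because they are already stored.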

Breakdown by Platform:

Platform | Articles Count
instagram | 316
twitter | 1
youtube | 3

Sample Articles (Latest 5):

ID | Platform | Platform Post ID | Title/Content | Relevancy Score | Relevancy Label | Sentiment | Sentiment Score | Raw Item ID | Raw Table | Created At
51202 | instagram | 18304 | Our latest #jacketring - truly stunning! Why? Beca... | 0.9000 | very_high | positive | 0.8000 | 18304 | instagram_raw | 2026-05-19 12:43:35
51201 | instagram | 18339 | Follow @johnsfinancetips for more Personal Finance... | 0.7000 | high | negative | 0.2000 | 18339 | instagram_raw | 2026-05-19 12:43:35
51200 | instagram | 18275 | AND THAT’S WHAT IT LOOKS LIKE WHEN ITS FLAWLESS!!!... | 0.8500 | high | positive | 0.8000 | 18275 | instagram_raw | 2026-05-19 12:43:35
51203 | instagram | 18603 | Nerd alert: this post about the DeBeers Cullinan B... | 0.8500 | high | positive | 0.8000 | 18603 | instagram_raw | 2026-05-19 12:43:35
51229 | instagram | 18409 | I DON’T IS NOT AN OPTION!!! @debeersofficial in Ba... | 0.9000 | very_high | positive | 0.8000 | 18409 | instagram_raw | 2026-05-19 12:43:35

Campaign Articles Breakdown by Relevancy Label:

Relevancy Label | Count
high | 195
very_high | 125

Additional: Background Jobs Status

Database Table: background_jobs (SELECT)

ℹ️ Background jobs track long-running scraping processes. This is optional tracking.

No background jobs found for this campaign

Additional: AI Processing Jobs Status

Database Table: ai_jobs (SELECT)

No AI jobs found for this campaign

📋 Complete Summary Report

📊 Processing Status Summary

Metric | Value
Campaign ID | 239
Campaign Status | active
Relevancy Threshold | 0.70
Total Raw Records Collected | 427
AI Relevancy Results | 400
Campaign Articles (Final) | 320
Background Jobs | 0
AI Jobs | 0
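The three totals in the summary form a funnel from raw rows to scored rows to final articles. The sketch below only computes the pass-through ratios from the reported totals; the variable names are illustrative.

```python
# Totals copied from the processing status summary above.
raw_records = 427
ai_results = 400
articles = 320

scored_share = ai_results / raw_records  # share of raw rows with an AI result
relevant_share = articles / ai_results   # share of AI results that became articles

print(f"{scored_share:.1%}")    # 93.7%
print(f"{relevant_share:.1%}")  # 80.0%
```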

🌐 External APIs Used

API Service | Provider | Usage
apidojo~tweet-scraper | Apify | Twitter data collection (Step 2)
streamers~youtube-scraper | Apify | YouTube data collection (Step 2)
apify~instagram-post-scraper | Apify | Instagram data collection (Step 2)
ScrapingDog API | ScrapingDog | News & Blogs data collection (Step 2)
OpenAI API | OpenAI | AI relevancy analysis (Step 3)
Gemini API | Google | AI relevancy analysis (Step 3, alternative)

📊 Database Tables Used

Step | API Endpoint | Table Name | Operation
Step 1 | api/start_background_scraping.php | campaigns_impact (relevancy_threshold) | SELECT
Step 2 | api/collect_data_step2.php | twitter_raw, youtube_raw, instagram_raw, news_raw, blogs_raw, facebook_raw | INSERT, SELECT COUNT
Step 3 | api/check_raw_mentions.php | twitter_raw, youtube_raw, instagram_raw, news_raw, blogs_raw, facebook_raw, ai_relevancy_results (platform_post_id, content_hash, processing_state, raw_item_id, raw_table_name) | SELECT, INSERT
Step 4 | api/sync_campaign_articles.php | campaign_articles (platform_post_id, matched_terms, ai_reasoning, relevancy_label, sentiment_score, raw_item_id, raw_table_name) | INSERT/UPDATE, SELECT COUNT, SELECT

🔄 Processing Flow Summary

Step | Description | API Endpoint | Duration
Step 1 | Initialize scraping process | api/start_background_scraping.php | 10-20 seconds
Step 2 | Collect data from platforms (100 results per platform) | api/collect_data_step2.php | 10-20 minutes
Step 3 | AI relevancy analysis (50 mentions per batch) | api/check_raw_mentions.php | 5-7 minutes
Step 4 | Complete processing & save results | api/sync_campaign_articles.php | 1-2 minutes
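Adding up the per-step durations gives a rough end-to-end estimate. This is a sketch that assumes the four steps run strictly one after another with no waiting in between; the ranges are the ones listed in the table above, and `fmt` is an illustrative helper.

```python
# (min_seconds, max_seconds) per step, from the processing flow table.
STEP_DURATIONS = {
    "Step 1": (10, 20),
    "Step 2": (10 * 60, 20 * 60),
    "Step 3": (5 * 60, 7 * 60),
    "Step 4": (1 * 60, 2 * 60),
}


def fmt(seconds: int) -> str:
    """Format a duration in seconds as e.g. '16m10s'."""
    minutes, rest = divmod(seconds, 60)
    return f"{minutes}m{rest:02d}s"


total_min = sum(low for low, _ in STEP_DURATIONS.values())
total_max = sum(high for _, high in STEP_DURATIONS.values())

print(fmt(total_min))  # 16m10s
print(fmt(total_max))  # 29m20s
```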

✓ Complete!

All steps debugged successfully!

Campaign ID: 239 | Raw Records: 427 | AI Results: 400 | Articles: 320