Add iranestekhdam.ir as an ingestion source (clinical job ads at named facilities)

New IranEstekhdamListingSource: reads the site's monthly ad sitemaps
(sitemap-ads.xml -> sitemap-ads-YYYY-M.xml), keeps only ad URLs whose Persian slug names a
clinical role (veterinary and non-clinical roles are excluded), then extracts each ad's title
and description (plus phone). These are employer ads at NAMED facilities, so they directly
reduce the unknown-facility problem that the classifieds content has.
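
The sitemap walk and slug filter described above can be sketched roughly as follows. This is
an illustration only, not the code in this commit: the class name, the helper names, and
especially the Persian keyword lists are assumptions made for the example, and the real
IranEstekhdamListingSource may structure this differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class SitemapSlugFilterSketch
{
    // Hypothetical keyword lists (nurse, midwife, physician / veterinarian).
    // The lists actually used by the source are not shown in this commit.
    private static readonly string[] ClinicalTerms = { "پرستار", "ماما", "پزشک" };
    private static readonly string[] ExcludedTerms = { "دامپزشک" };

    // Matches the monthly sitemap naming from the commit message: sitemap-ads-YYYY-M.xml
    private static readonly Regex MonthlySitemap =
        new Regex(@"/sitemap-ads-\d{4}-\d{1,2}\.xml$", RegexOptions.CultureInvariant);

    // Returns every <loc> value in a sitemap or sitemap-index document,
    // ignoring the XML namespace so both document kinds are handled alike.
    public static IEnumerable<string> ReadLocs(string xml) =>
        XDocument.Parse(xml)
            .Descendants()
            .Where(e => e.Name.LocalName == "loc")
            .Select(e => e.Value.Trim());

    public static bool IsMonthlyAdSitemap(string url) => MonthlySitemap.IsMatch(url);

    // Decodes the percent-encoded slug, rejects excluded roles first,
    // then requires at least one clinical term.
    public static bool IsClinicalAdUrl(string url)
    {
        string slug = Uri.UnescapeDataString(url);
        if (ExcludedTerms.Any(t => slug.Contains(t, StringComparison.Ordinal)))
            return false;
        return ClinicalTerms.Any(t => slug.Contains(t, StringComparison.Ordinal));
    }
}
```

The exclusion check runs before the inclusion check because the word for veterinarian
(دامپزشک) contains the word for physician (پزشک), so a plain substring match on the clinical
terms alone would let veterinary ads through.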

Wired in like Medjobs: AppSetting toggles (IranEstekhdamEnabled, IranEstekhdamMaxAds,
IranEstekhdamUseProxy) + EF migration, SettingsService persistence, admin Settings UI, and DI
registration. Off by default; the medical-gate validator + AI auditor + junk filters screen
results downstream.

Note: e-estekhdam / jobinja / jobvision are JS-rendered SPAs whose ad lists are not in static
HTML, so they need API reverse-engineering (a separate effort), not this static-scrape path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
soroush.asadi
2026-06-21 07:39:39 +03:30
parent da55f82c6c
commit f118db55ef
9 changed files with 1869 additions and 0 deletions
@@ -55,6 +55,9 @@ public class SettingsService
         s.DivarQueries = incoming.DivarQueries?.Trim();
         s.MedjobsEnabled = incoming.MedjobsEnabled;
         s.MedjobsMaxAds = Math.Clamp(incoming.MedjobsMaxAds, 1, 500);
+        s.IranEstekhdamEnabled = incoming.IranEstekhdamEnabled;
+        s.IranEstekhdamMaxAds = Math.Clamp(incoming.IranEstekhdamMaxAds, 1, 500);
+        s.IranEstekhdamUseProxy = incoming.IranEstekhdamUseProxy;
         s.SmsEnabled = incoming.SmsEnabled;
         s.SmsApiKey = incoming.SmsApiKey?.Trim();
         s.SmsTemplate = incoming.SmsTemplate?.Trim();