Commit Graph

42 Commits

Author SHA1 Message Date
soroush.asadi baa617daa9 Strip «آماده به کار» from role names + reject domestic-helper ads
CI/CD / CI · dotnet build (push) Successful in 2m3s
CI/CD / Deploy · hamkadr (push) Successful in 3m14s
Re-check of live applicants found two gaps:
- «کمک بهیار آماده به کار» — the availability phrase glued onto the role. StripRoleModifiers
  now removes «آماده به کار / آماده همکاری / جویای کار / جهت همکاری» phrases before
  token-stripping, so the role collapses to «کمک بهیار».
- «خانم امورسبک منزل» — light-housework domestic helpers (not کادر درمان). Validator
  now discards ads with «امور منزل / نظافت منزل / خدمتکار / مستخدم …» markers.

Both take effect for existing data on the next applicant reprocess.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 19:58:06 +03:30
soroush.asadi 8d0a403b36 Near-duplicate applicant detection (collapse source reposts)
CI/CD / CI · dotnet build (push) Successful in 1m57s
CI/CD / Deploy · hamkadr (push) Successful in 1m9s
Exact ContentHash dedup misses the same ad reposted with slightly different text
(e.g. the ~18 repeated «کمک‌یار آقا»). DedupeTalentAsync collapses open aggregated
applicants by two high-precision signals — identical phone, or identical
(role, city, normalized description core with digits/«… پیش» time-phrases
stripped) — keeping the newest of each group. Runs at the end of both RunAsync
and ReprocessAsync; removed count surfaces in the run log.

Improvement 1 of the data-quality/SEO backlog.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 17:54:26 +03:30
soroush.asadi fb7bfad9ce Reprocess: SEO-safe applicants-only default (don't churn indexed shift/job URLs)
CI/CD / CI · dotnet build (push) Successful in 2m11s
CI/CD / Deploy · hamkadr (push) Successful in 2m10s
Reprocess deletes+rebuilds aggregated listings, which changes their IDs. Shift/Job
detail pages are indexed and in the sitemap, so churning them would 404 ranked
URLs. «آماده به کار» pages are NoIndex + Disallow, so rebuilding them has zero SEO
impact — and that's where all the duplicate/sprawl problems were.

ReprocessAsync(talentOnly: true) now only deletes/rebuilds TalentListings and
skips non-talent raws (leaving shift/job listings + their RawListing links
untouched). Admin button relabelled «پردازش مجددِ آماده به کارها (امن برای SEO)».
Shifts/jobs self-clean via normal ingestion turnover.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 16:08:20 +03:30
soroush.asadi e582597b20 Geocoding fallback: use the registered AI model when the table can't resolve
CI/CD / CI · dotnet build (push) Successful in 1m15s
CI/CD / Deploy · hamkadr (push) Successful in 1m1s
Where deterministic geocoding gives up (neighborhood not in the TehranGeo table),
fall back to the registered AI model: the auditor now also returns approximate
lat/lng for a recognized Tehran neighborhood (folded into the existing single
audit call — no extra requests), and Publish uses it only after the source ad and
the local table, and only when it falls inside greater Tehran (InTehran bbox
guard rejects hallucinated points). Coords order: Divar point → TehranGeo → AI.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 15:48:42 +03:30
soroush.asadi 85a5191c45 AI qualify round 2: strip gender/seniority from roles, aide synonyms, more tag noise
CI/CD / CI · dotnet build (push) Successful in 1m17s
CI/CD / Deploy · hamkadr (push) Successful in 1m39s
Re-checked live data and found cases the first pass missed:
- Gender baked into roles («پرستار آقا», «کمک بهیار آقا») → StripRoleModifiers
  removes آقا/خانم/مرد/زن/کارآموز/ارشد… from role names (none of the real roles
  contain these), collapsing the sprawl; gender still lives on the Gender field.
- «کمک‌یار» vs «کمک بهیار» forking → alias maps them to one role.
- Personality words («خوش‌اخلاق», «دلسوز», «منظم»…) added to the tag stop-list.
- Prompt: gender goes to the gender field, not the role.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 15:41:06 +03:30
soroush.asadi 993c34758f Geocode neighborhood names to an approximate location (no source coords)
CI/CD / CI · dotnet build (push) Successful in 2m4s
CI/CD / Deploy · hamkadr (push) Successful in 1m54s
Many Medjobs/Telegram ads name a Tehran neighborhood («ونک», «تهرانپارس»…) but
carry no coordinates. New TehranGeo geocoder maps ~45 neighborhood names to a
rough center; Publish falls back to it (from the resolved district / AI district
/ area note) when the source ad has no point. Shown via the existing «محدودهٔ
تقریبی» circle + disclaimer — never a precise pin. Tehran-only; extends the
existing approx-coords feature so non-Divar listings can show a map too.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 15:31:27 +03:30
soroush.asadi 4ab6ce29c9 Approximate-location map on aggregated listings (Divar coords)
CI/CD / CI · dotnet build (push) Successful in 1m59s
CI/CD / Deploy · hamkadr (push) Successful in 1m49s
We captured Divar's privacy-fuzzed coords on RawListing but discarded them for
the listings that need them: unnamed-facility shifts/jobs dropped them (to avoid
piling on the shared placeholder) and applicants had no coordinate field at all.

- Add Lat/Lng to Shift, JobOpening, TalentListing (migration ListingApproxCoords).
- Publish stores the source ad's approx coords on each aggregated listing.
- Detail pages render the map from the listing's own coords (fallback: facility),
  and aggregated coords show as a shaded «محدودهٔ تقریبی» circle (not a precise
  pin) via _NeshanMap data-approx, with a disclaimer. Applicants get a map card
  (they had none) + the page now loads the Neshan key.

Only Divar provides coords; the map needs NeshanMapKey set in admin settings.
Existing rows get coords once reprocessed (RawListing already has them).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 15:10:05 +03:30
soroush.asadi d62929ca0d AI qualify: de-dupe applicants, base roles, closed categories, tag hygiene + reprocess-stored action
CI/CD / CI · dotnet build (push) Successful in 2m35s
CI/CD / Deploy · hamkadr (push) Successful in 1m23s
Qualified live applicants and found three problems, all fixed:
- Duplicate cards: one ad fanned out into «پرستار» + «پرستار کودک» (same person).
  Applicants now publish ONE listing (no role fan-out); secondary roles → tags.
- Role sprawl: modifiers became roles. Prompt now returns the BASE profession
  and pushes age-group/ward/seniority to tags; new roles only for a genuinely
  new base profession (تکنسین داروخانه ✓, پرستار کودک ✗).
- Tag/category noise: categories pinned to the 5 fixed groups (+سایر, never
  invented); BuildTags drops pay/contact/location/fragment words.

Reprocess action: IngestionService.ReprocessAsync re-runs the current pipeline
over every stored RawListing WITHOUT re-fetching (keeps the raw text, so nothing
is lost to sources only exposing recent posts), deleting the old aggregated
posts and republishing cleanly. Admin dashboard button «پردازش مجددِ آیتم‌های
ذخیره‌شده» runs it on a background scope; result lands in the run-log.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 14:24:20 +03:30
soroush.asadi 38031cb189 Per-ad contacts for shifts/jobs, stale-applicant filter, review source link
CI/CD / CI · dotnet build (push) Successful in 1m3s
CI/CD / Deploy · hamkadr (push) Successful in 1m18s
Phone fix: shifts/jobs showed Facility.Phone, but unnamed ads all share one
placeholder facility, so every such listing displayed the same stale number
while the ad's real phone sat unused in the description. ContactMethod is now
attachable to a Shift/JobOpening (not just talent); ingestion stores the ad's
own number(s) on each listing and the detail pages render them (new
_ContactList partial), falling back to the facility phone only when the ad had
none. Migration ShiftJobContacts (nullable owner FKs) — auto-applies on deploy.

Stale applicants: skip «آماده به کار» posts older than 7 days at ingest, by the
source's real timestamp (Telegram <time>, Bale date) or a Persian time-ago
phrase in the text (Divar «۲ هفته پیش»). Recorded as Discarded; shifts/jobs
are not aged out.

Admin: Review page now shows a «مشاهده آگهی در منبع» link (RawListing.SourceUrl)
so the source post can be checked before publishing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 21:28:12 +03:30
soroush.asadi 380243b669 Divar geo-coords to facility map + medical gate + RawListing FK/geo migrations
CI/CD / CI · dotnet build (push) Successful in 2m6s
CI/CD / Deploy · hamkadr (push) Successful in 2m3s
2026-06-09 21:38:55 +03:30
soroush.asadi cf5e0011c4 AI ingestion: dynamic role/category creation + tags, hardcoded read-only prompt
CI/CD / CI · dotnet build (push) Successful in 2m19s
CI/CD / Deploy · hamkadr (push) Successful in 2m12s
- Unknown roles from the AI are now resolved-or-CREATED (Persian-normalized dedupe) instead of dropped/fallback; new role gets the AI's category, assigned to the applicant.
- AI output gains category + tags; AI-detected skills/requirements (ICU, MMT, پروانه‌دار…) now fold into the applicant's searchable Tags.
- System prompt is hardcoded in AppSetting.DefaultPrompt and used directly by the auditor; admin sees it read-only (cannot edit/break it).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 19:04:24 +03:30
soroush.asadi 59fb30ac77 AI auditor: surface the real connection error instead of swallowing it
The Test-AI button called AuditAsync, which caught every exception and returned
null, and used EnsureSuccessStatusCode() (discarding the response body). So a
failing AI service only ever produced a generic 'no response' message with no
detail — impossible to diagnose.

- Add IAiAuditor.TestAsync: runs the real call and returns a detailed Persian
  diagnostic — HTTP status + response body on non-2xx, raw body when the shape
  isn't OpenAI-compatible, and network/proxy/timeout specifics on exceptions.
- AuditAsync now logs the actual HTTP status + response body (and proxy state)
  instead of a bare warning, so server logs show why a call failed.
- ExtractContent / ParseVerdict no longer throw on unexpected JSON; they return
  null so the caller can show the raw body.
- Settings 'Test AI' button uses TestAsync; result box renders multi-line and
  switches to alert-error styling when the test fails.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 18:30:12 +03:30
soroush.asadi 6b657c7795 Applicants: auto-tags + deep search w/ highlight; never delete (archive instead)
CI/CD / CI · dotnet build (push) Successful in 2m1s
CI/CD / Deploy · hamkadr (push) Successful in 2m36s
- Tags: parser extracts cert/skill keywords (mmt, ICU/CCU, دیالیز, اتاق عمل,
  اورژانس, مسئول فنی, پروانه‌دار…) + role + city into TalentListing.Tags
  (+ migration); shown as chips on cards.
- Deep search on /Talent: «جستجوی عمیق» box does Postgres ILIKE across
  tags, description, person, area, role, city (every term must match);
  matches are highlighted with <mark> via SearchHighlight.
- Never delete: ShiftStatus.Archived + the admin «بایگانی گروهی» action now
  ARCHIVES aggregated posts (hidden from site, kept in DB) and leaves the
  raw crawl rows intact — a permanent archive for future analytics.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 11:25:32 +03:30
soroush.asadi e4dc5180ad Applicants: 1→N contact methods with types (phone/email/Instagram/Telegram/Bale/site)
CI/CD / CI · dotnet build (push) Successful in 1m32s
CI/CD / Deploy · hamkadr (push) Successful in 1m31s
- ContactMethod entity (Type + Value + SortOrder) 1→N on TalentListing (+ migration).
- Parser extracts ALL contacts: multiple phones + landlines, email, and
  socials (Instagram/Telegram/Bale/WhatsApp/website) via URLs and Persian
  keyword cues; primary Phone kept for cards.
- ContactInfo helper: per-type label/icon/clickable href (tel:/mailto:/t.me/…).
- Ingestion attaches contacts to each (fanned-out) talent listing; manual
  Review re-parses to attach them + the admin-typed phone.
- Talent details renders the full contact list as buttons; falls back to the
  single phone, then the Divar source link.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 11:10:19 +03:30
soroush.asadi 48760c4e83 Multi-role ads: parse all roles + fan-out publish one listing per role
CI/CD / CI · dotnet build (push) Successful in 2m16s
CI/CD / Deploy · hamkadr (push) Has been cancelled
An ad like «استخدام پرستار سالمند و کودک و همراه بیمار» names several roles;
we kept only the first. Now:
- Parser collects ALL roles (ParsedListing.RoleNames): exact taxonomy
  matches (substring-deduped so پرستار⊂پرستار سالمندان) plus synonyms
  (سالمند→پرستار سالمندان, کودک/همراه بیمار→پرستار, اتاق عمل→تکنسین اتاق عمل…),
  capped at 4.
- Ingestion publishes one Shift/Job/Talent per resolved role (AI role +
  parser roles, distinct, capped), so each role is independently
  browsable and filterable. RawListing dedupe is unchanged (one raw → N posts).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 10:58:29 +03:30
soroush.asadi 13e00ec011 Validator: phone optional for applicants (publish + redirect to Divar)
CI/CD / CI · dotnet build (push) Successful in 3m10s
CI/CD / Deploy · hamkadr (push) Successful in 4m8s
A Divar applicant whose number is behind the login-gated reveal should
still publish — the detail page already links back to Divar for the phone.
Talent now scores role(40)+medical(10)=50, so role+medical alone passes
without a phone; phone just adds confidence.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 09:32:48 +03:30
soroush.asadi 386e25c8fd Validator: discard promotional/training ads (workshops, courses)
CI/CD / CI · dotnet build (push) Has been cancelled
CI/CD / Deploy · hamkadr (push) Has been cancelled
Medical-flavored ads like «کارگاه بوتاکس و فیلر… ویژه پزشکان ۱۰٪» passed the
medical gate and got misclassified as a پزشک عمومی shift with a bogus 10%
share. Now: if a course/event/product marker is present and there's no
staffing intent (hiring/shift/availability), the item is auto-discarded.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 09:30:23 +03:30
soroush.asadi 2bb8771ade Normalize ریال→تومان pricing; stop exposing crawl source (medjobs/telegram)
CI/CD / CI · dotnet build (push) Successful in 29s
CI/CD / Deploy · hamkadr (push) Successful in 42s
- Parser now reads the currency: ریال amounts (incl. «میلیون ریال» and
  numbers with no تومان unit but ≥200M) are converted to تومان (÷10), so
  «۴۰۰٬۰۰۰٬۰۰۰ ریال» shows as ۴۰٬۰۰۰٬۰۰۰ تومان instead of 400M.
- Aggregated facility fallback name no longer embeds the source
  («مرکز درمانی (از مدجابز)» → «مرکز درمانی (نامشخص)»).
- Talent details only ever names Divar as a fallback source (when the
  number couldn't be extracted); medjobs/telegram are never shown publicly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 09:05:34 +03:30
soroush.asadi 490821a637 Talent lifecycle (21-day expiry) + noindex expired job/shift details
CI/CD / CI · dotnet build (push) Successful in 2m24s
CI/CD / Deploy · hamkadr (push) Successful in 2m47s
- Talent «آماده به کار» now has its own freshness window (21 days, vs 30
  for jobs) since availability goes stale fast; archiver, browse, and home
  use TalentCutoffUtc.
- Expired/filled job openings and past/filled shifts now emit
  robots noindex so Google drops dead listings instead of keeping
  soft-404 pages. (Talent details were already noindex.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 08:59:54 +03:30
soroush.asadi 0622270cd2 Fix: site-wide phone on every Medjobs ad + phone mistaken for price
CI/CD / CI · dotnet build (push) Successful in 2m7s
CI/CD / Deploy · hamkadr (push) Successful in 1m59s
- HarvestPhones was run over the whole page, so Medjobs' own header/footer
  number (09101016110) was appended to every ad. Now harvest only the ad's
  description region in Medjobs + Website sources; the protected number
  still comes from the reveal call. No more duplicate number across ads.
- The amount extractor read phone digits as a Toman price
  (۹,۱۰۱,۰۱۶,۱۱۰ تومان). The parser now strips «شماره تماس…» lines and
  mobile/landline numbers before extracting money, and only accepts 6–10
  digit numbers with no leading zero (phones/ids start with 0 or are 11+).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 08:42:21 +03:30
soroush.asadi a5d6e212e2 Divar: capture post token + harvest phone from full ad detail
CI/CD / CI · dotnet build (push) Successful in 2m4s
CI/CD / Deploy · hamkadr (push) Successful in 2m18s
- Harvest now keeps each post's token, so we build a real post URL
  (divar.ir/v/{token}) instead of a generic link.
- For each post we fetch the detail JSON (posts-v2/web/{token}) and
  harvest any contact number from it — covering the very common case
  where the poster writes the phone into the ad description. Divar's
  click-to-reveal is login-gated, so this gets the in-text numbers
  without auth; fails soft (blocking/errors → skip).
- HarvestPhones hardened with digit-boundary guards so it can't grab a
  slice of a longer numeric id/timestamp inside JSON.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 08:28:37 +03:30
soroush.asadi d238888710 Medjobs: reveal hidden contact number via admin-ajax during crawl
CI/CD / CI · dotnet build (push) Successful in 1m16s
CI/CD / Deploy · hamkadr (push) Successful in 2m14s
The contact phone on medjobs.ir is loaded by JS only after clicking
«تماس با این آگهی» — it isn't in the page HTML, so scanning the markup
found nothing. We now replay that exact reveal request server-side:

- POST https://medjobs.ir/wp-admin/admin-ajax.php with
  action=isatis_protect_contact & id=<listingId> (no nonce needed),
  then harvest the tel: numbers from the returned HTML table.
- Listing id is pulled from the page via the WP shortlink (?p=ID),
  postid-/data-id, or the visible «کد آگهی» as a fallback.
- Numbers are appended to the ad text so the parser/AI capture them and
  they reach the published listing. Wrapped in try/catch so a failed
  reveal never breaks ingestion; uses the same (proxy-aware, brotli-
  decompressing) client as the page fetch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 08:21:24 +03:30
soroush.asadi 213af9db48 AI tag/category assignment + phone extraction from web ads
CI/CD / CI · dotnet build (push) Successful in 2m37s
CI/CD / Deploy · hamkadr (push) Successful in 1m11s
AI (when enabled, now that the server proxy is up):
- AiStructured gains phone, personName, yearsExperience, isLicensed.
- The auditor appends an authoritative output-schema to the admin prompt
  so classification stays correct even with an older stored prompt — it
  now classifies kind as shift|job|talent and extracts the contact phone
  and talent details.
- Ingestion publish prefers the AI's tags (kind/role/city/facility/phone +
  talent fields) over the heuristic parser when present.
- Default prompt updated to describe the three kinds + new fields.

Phone extraction from websites (Medjobs / generic sites), where the
number sits behind a "تماس با این آگهی" reveal:
- HtmlUtil.HarvestPhones scans the full markup for tel: links, JSON-LD
  "telephone", data-*phone* attributes, and inline Iranian mobile/landline
  numbers (Persian digits folded), normalized (mobiles 09…, landlines 0…).
- Medjobs + Website sources append harvested numbers to the ad text so the
  parser/AI capture them; manual review then prefills the phone too.
- Parser phone extraction now also captures a landline as a fallback.

Note: if a site loads the number purely via XHR (not in HTML), a
per-source reveal endpoint would be a follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 08:11:14 +03:30
soroush.asadi 4e5df73cf7 Add «آماده به کار» (talent) listing type — workers offering themselves
CI/CD / CI · dotnet build (push) Successful in 1m41s
CI/CD / Deploy · hamkadr (push) Has been cancelled
Adds a third listing kind alongside Shift/Job for healthcare staff who
advertise their own availability (very common in Iranian medical
channels, e.g. "دندانپزشک آماده همکاری… ۵۰٪ تسویه"). These have no
facility; the contact phone is the key field.

- Model: TalentListing (role, person name, years, licensed, city/district,
  area note, availability, gender, comp, phone) + ListingKind.Talent +
  RawListing.LinkedTalentId + DbSet/relations/indexes + EF migration.
- Parser: detect آماده‌به‌کار/جویای کار → Kind=Talent; extract person name,
  years of experience, licensed flag, area («منطقه ۱»), phone. Facility
  name extraction now skipped for talent.
- Validator: talent path scores role + phone + medical (no facility/pay
  required).
- Ingestion auto-publish: creates a TalentListing for talent kind.
- Review (manual publish): Talent option + talent fields; publishes a
  TalentListing without a facility. Shift/Job facility now falls back to a
  shared «نامشخص / ثبت نشده» record when the ad names none — publishing
  never fails on a missing facility.
- Browse /Talent (indexable, filters: city/district/role/gender),
  details /Talent/Details (noindex — personal contact, tel: call button),
  _TalentCard, badge-talent, nav link, home section.
- Sitemap includes /Talent; robots disallows /Talent/Details. Archiver
  expires stale talent listings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 08:01:12 +03:30
soroush.asadi e6a796ab27 Match crawled listings to existing facilities (fuzzy) before creating new
CI/CD / CI · dotnet build (push) Successful in 1m28s
CI/CD / Deploy · hamkadr (push) Successful in 2m24s
When publishing a scraped listing we now look for a facility we already
have that is exactly or closely the same, and only create a new one when
there is no match — avoiding duplicates like «بیمارستان میلاد» vs «میلاد».

- ListingParser: extract a facility name (keyword + distinctive words) from
  the post and surface it in the parser notes.
- FacilityMatcher: Persian-aware normalization (ي/ك, ZWNJ, punctuation),
  type-word stripping for a "core" name, contains + Levenshtein similarity,
  and FindBest (same-city exact → any-city exact → same-city fuzzy → fuzzy).
- Review (manual publish): auto-select a matching facility or prefill the
  new-facility name; resolve-or-create uses fuzzy match; dropdown preselects.
- IngestionService (auto-publish): reuse FacilityMatcher against a run-wide
  facility list (grows as new ones are created) instead of exact-name only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 07:14:48 +03:30
soroush.asadi 487c7ca82f [Ingest] Persistent crawl run-log + per-source breakdown on admin queue
CI/CD / CI · dotnet build (push) Has been cancelled
CI/CD / Deploy · hamkadr (push) Has been cancelled
Each ingestion run now records an IngestionRun row (found/queued/published/flagged/spam/duplicates + a per-source detail string). Admin → صف آگهی‌ها shows a «تاریخچه جمع‌آوری» table of the last 15 runs (hover a row for the per-source breakdown), so admins can see how much each source found vs added over time. IngestionSummary gains TotalFetched. Migration: IngestionRuns table.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 06:23:58 +03:30
soroush.asadi 524c66e25e [Admin] VPN/proxy + AI test buttons; fix AI JSON parse crash on null fields
CI/CD / CI · dotnet build (push) Successful in 2m41s
CI/CD / Deploy · hamkadr (push) Failing after 2m56s
Add «تست اتصال VPN/پروکسی» (reaches a filtered site through the proxy and reports connected/latency) and «تست هوش مصنوعی» (sends a sample post through the configured model and shows the verdict + extracted fields) to admin Settings. Fix OpenAiCompatibleAuditor.ParseVerdict: TryGetInt32/64 threw on null/string JSON values (the model commonly returns payAmount/sharePercent as null), which silently failed every audit — now guarded on ValueKind==Number. Verified the real OpenAI key extracts perfectly (approve / role=پرستار / city=تهران / shift=night).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 23:23:02 +03:30
soroush.asadi 0c49b89891 [AI] Route AI calls through the Xray/V2Ray proxy (reach OpenAI from Iran)
CI/CD / CI · dotnet build (push) Successful in 1m46s
CI/CD / Deploy · hamkadr (push) Failing after 1m58s
Add AiUseProxy setting + a toggle in the AI settings section. ScrapeHttpClients.ForAi(settings) returns a proxied HttpClient (reusing IngestProxyUrl, 100s timeout) when AiUseProxy is on, otherwise direct; AI-cache keys are protected from the scrape-client cleanup. OpenAiCompatibleAuditor now uses it, so the AI auditor (e.g. api.openai.com) is reachable through the same Xray sidecar that serves Telegram. Migration adds the column.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 22:55:07 +03:30
soroush.asadi 018c0f0286 [Ingest] Tune parser/validator for real Divar+Medjobs data
CI/CD / CI · dotnet build (push) Successful in 2m53s
CI/CD / Deploy · hamkadr (push) Failing after 2m39s
Analyzed live Divar (POST search) and Medjobs (ad_listing sitemaps) data — both are free Persian text. Tighten the medical-relevance gate (drop generic «استخدام»/«شیفت» that match retail/restaurant ads; add clinical terms: بهیار/اتاق عمل/بیهوشی/رادیولوژی/آزمایشگاه/دیالیز/فوریت/تریاژ/… ) so off-topic Divar jobs get flagged, not treated as medical. Add clinical role synonyms in the heuristic parser (بهیار/کمک‌پرستار/سالمند→پرستار, اتاق عمل→تکنسین اتاق عمل, فوریت→فوریت‌های پزشکی, آزمایشگاه→کارشناس آزمایشگاه, فوق‌تخصص→پزشک متخصص…). Result on live data: Medjobs now yields ~9/30 queue-ready healthcare listings; Divar correctly flags ~72/75 noise for manual review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 22:34:05 +03:30
soroush.asadi 2485173aad [Ingest] Fix Divar: use POST search API (GET was anti-bot blocked)
CI/CD / CI · dotnet build (push) Successful in 1m51s
CI/CD / Deploy · hamkadr (push) Successful in 2m8s
Divar's /v8/web-search GET returns a BLOCKING_VIEW (anti-bot), so the old source pulled nothing useful and could scrape the block message. Switch to the working POST /v8/postlist/w/search with a browser User-Agent and a city-id map (numeric id passthrough; tehran=1 default). Skip responses that are non-2xx or contain BLOCKING_VIEW so the block page is never ingested. Verified locally: fetched 25 real Tehran job posts into the review queue, 0 block messages.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 21:23:36 +03:30
soroush.asadi b1e474ba33 [Ingest] Per-source proxy toggle instead of one global switch
CI/CD / CI · dotnet build (push) Successful in 56s
CI/CD / Deploy · hamkadr (push) Successful in 1m6s
Each ingestion source now decides independently whether to route through the proxy: added TelegramUseProxy/BaleUseProxy/DivarUseProxy/MedjobsUseProxy/WebsitesUseProxy flags (migration). ScrapeHttpClients.For(s, useProxy) takes the source's own flag; a source is proxied only when its flag is on AND a proxy URL is set. Settings 'sources' tab: removed the global enable checkbox, kept the proxy address field, and added an «از پروکسی استفاده شود» checkbox under each source. Old IngestProxyEnabled column kept for compatibility but no longer gates routing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 18:46:48 +03:30
soroush.asadi cea27c8684 [Ingest] Route scraping through an optional V2Ray/Xray proxy (Telegram in Iran)
CI/CD / CI · dotnet build (push) Successful in 53s
CI/CD / Deploy · hamkadr (push) Successful in 1m12s
Telegram and some sources are filtered in Iran. .NET cannot speak vmess/vless/trojan, so add an Xray sidecar (compose service 'xray', behind the 'proxy' profile) that converts the admin's config into a local SOCKS5 proxy (xray:10808). New ScrapeHttpClients provider builds a proxied or direct HttpClient (WebProxy supports socks5/socks4/http) cached per proxy URL; all five ingestion sources (Telegram/Bale/Divar/Medjobs/Websites) now use it. Admin settings gain IngestProxyEnabled + IngestProxyUrl (migration; UI under sources). Added deploy/xray/config.json template + README with vmess/vless/trojan examples.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 17:53:17 +03:30
soroush.asadi 8fad9c1bb6 [Admin] Notification channel toggles (web/SMS/push active-deactive)
CI/CD / CI · dotnet build (push) Successful in 50s
CI/CD / Deploy · hamkadr (push) Successful in 1m1s
Add a 'notification channels' card at the top of admin Settings with three master on/off checkboxes: web/in-app (new WebNotificationsEnabled, default true), SMS (existing SmsEnabled), and Web Push (existing PushEnabled). Removed the duplicate enable checkboxes from the SMS and Push sections so each binds once. NotificationService now gates the in-app + live SSE channel on WebNotificationsEnabled; push self-gates on PushEnabled. Migration defaults the new column to true so existing installs keep web notifications on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 15:56:40 +03:30
soroush.asadi 0c0449c2b9 [Demo] Add admin demo-mode toggle + generic website ingest source
CI/CD / CI · dotnet build (push) Failing after 1m40s
CI/CD / Deploy · hamkadr (push) Has been skipped
- AppSetting: DemoMode, WebsitesEnabled, WebsiteUrls
- Facility.IsDemo flag; SeedData split into SeedReferenceAsync (always)
  + SeedDemoAsync/ClearDemoAsync (idempotent, toggleable at runtime)
- WebsiteListingSource: scrape any admin-configured URL (og:title + content)
- Admin Settings: seed/clear demo card, demo-mode checkbox, website source
  fields; Program.cs seeds demo when DemoMode on (or in Development)
- EF migration DemoModeAndWebsites

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 13:43:07 +03:30
soroush.asadi a02eb6a985 PWA: installable app (web/win/android/ios) + download/help page + push notifications
CI/CD / CI · dotnet build (push) Successful in 40s
CI/CD / Deploy · hamkadr (push) Successful in 55s
- manifest.webmanifest + service worker (offline shell + push + notificationclick) + PNG icons (192/512/apple) + iOS meta + SW registration → installable everywhere
- /Download page: per-OS install help (web/windows/android/ios), install button (beforeinstallprompt), 'enable notifications' flow, usage guide, Bazaar/TWA note; nav + footer links
- Web Push foundation: WebPushSubscription entity + /push/subscribe (stores), VAPID + push settings in /Admin/Settings, on-device local notification; server broadcast documented (WebPush via Nexus)
- docs/PWA-TWA.md: VAPID keygen, server-push wiring, Bubblewrap→Cafe Bazaar + assetlinks steps
- Verified: manifest/sw/icons served, download page, subscribe stores (200), layout wired

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 11:23:13 +03:30
soroush.asadi 9a92da42e6 Facility location: click-to-pick Neshan map + 'my current location'
CI/CD / CI · dotnet build (push) Successful in 37s
CI/CD / Deploy · hamkadr (push) Successful in 40s
- RegisterFacility: '📍 موقعیت فعلی من' (browser geolocation, always available) + Neshan Leaflet map (click/drag marker → fills lat/lng) when a Neshan web key is set; graceful fallback to manual coords without a key
- AppSetting.NeshanMapKey configured in /Admin/Settings (Google Maps is blocked in Iran); migration
- Verified: location button + inputs render always; map + SDK render once the key is saved

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 10:47:33 +03:30
soroush.asadi 17d38431bf Add SEO sitemap/robots + real SMS OTP (Kavenegar, admin-configured)
CI/CD / CI · dotnet build (push) Successful in 31s
CI/CD / Deploy · hamkadr (push) Successful in 56s
- /sitemap.xml (static pages + open shifts + fresh jobs, respecting expiry) + /robots.txt (blocks /Admin,/Employer); base URL from forwarded request → https://hamkadr.ir in prod
- ISmsSender + KavenegarSmsSender (verify/lookup template, sms/send fallback); SMS settings (enabled/apikey/template/sender) in /Admin/Settings; OtpService.IssueAsync sends SMS and stops revealing the code when enabled (dev still shows it); migration

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 10:27:21 +03:30
soroush.asadi 6d2ad6f87e Hide + archive stale listings (old jobs, expired shifts)
CI/CD / CI · dotnet build (push) Successful in 37s
CI/CD / Deploy · hamkadr (push) Successful in 48s
- ListingPolicy.JobFreshnessDays=30: public /Jobs and home hide jobs older than the cutoff (shifts already require Date>=today)
- ListingArchiver flips stale Open→Expired: shifts past their date, jobs older than the cutoff. Runs at startup and on every IngestionWorker cycle (independent of ingestion being enabled)
- Verified: backdated job dropped off /Jobs (6→5) and was archived to Expired on the sweep

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 09:57:06 +03:30
soroush.asadi e2e26150cb Add medjobs.ir scraper + employer/employee choice at signup
CI/CD / CI · dotnet build (push) Successful in 1m22s
CI/CD / Deploy · hamkadr (push) Successful in 1m37s
- MedjobsListingSource: crawls medjobs.ir sitemaps (ad_listing-sitemapN) → fetches ad pages → title+description → engine (dedupe/parse/validate/publish as SEO job pages). Configured in /Admin/Settings (enable + max ads/run).
- Login/register now asks 'کادر درمان' vs 'کارفرما/مرکز': new accounts get Doctor vs FacilityAdmin role; post-login routes to /Me, /Employer, or /Admin accordingly.
- Verified live: medjobs run fetched real ads into the review queue; employer signup → /Employer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 06:12:10 +03:30
soroush.asadi 3c08c1a265 Move ingestion + Telegram/Bale/Divar config to DB-backed admin settings
CI/CD / CI · dotnet build (push) Successful in 6m22s
CI/CD / Deploy · hamkadr (push) Failing after 3s
- AppSetting gains source config: AutoIngestEnabled, IngestIntervalMinutes, Telegram/Bale/Divar enabled+channels/token/queries
- IListingSource.FetchAsync(AppSetting) — sources read config from DB, not IOptions/appsettings; sample source dev-only
- IngestionWorker reads AutoIngest+interval from DB each cycle (toggle at runtime, no redeploy)
- /Admin/Settings gets a 'منابع جمع‌آوری' section; removed Ingestion env/appsettings + compose env vars
- ENV_FILE shrinks to HOST_PORT + POSTGRES_* + ADMIN_PHONE (AI + sources are all in-admin); migration

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 00:44:11 +03:30
soroush.asadi 36bb165438 Real channel fetch (Telegram/Bale/Divar) + AI-audited automation engine + CI/CD
- Fetch: Telegram via t.me/s, Bale via Bot API, Divar via web-search (HttpClient, config-gated, graceful)
- AI layer: DB-backed AppSetting (mode auto/manual, thresholds, AI endpoint/model/key/prompt/framework, auto-approve); OpenAI-compatible IAiAuditor (self-host/Iranian endpoints; fails safe to manual)
- Pipeline: fetch → dedupe(hash) → parse → validate → AI audit → Discard/Flag/Queue/auto-publish (resolve-or-create facility)
- Admin: /Admin/Settings automation+AI panel; queue shows confidence + AI verdict; flagged section
- CI/CD: Dockerfile, docker-compose.prod.yml, .gitea/workflows/ci-cd.yml, nginx vhost, DEPLOY.md; forwarded headers + /healthz + prod reference-only seed; ports 22/80/443 only

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:41:02 +03:30
soroush.asadi 931b7b6ffb Add scrape/ingestion engine + validation, and 24h shift hour-range visualization
Scrape engine (Services/Scraping/): pluggable IListingSource (working sample + Telegram/Divar credential-ready stubs) → IngestionService (content-hash dedupe → parse → validate → review queue) → ListingValidator (completeness score + spam screen) → IngestionWorker (config-gated hosted service). RawListing gains ContentHash/Confidence/ValidationNotes; RawListingStatus.Flagged. Admin /Admin gets run-now, source list, confidence + flagged queue.

Hour-range viz: _HourBar 24h timeline bar (colored by type, overnight wrap) on shift cards, recommendation cards, and detail.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 08:18:19 +03:30