Commit Graph

10 Commits

Author SHA1 Message Date
soroush.asadi 48760c4e83 Multi-role ads: parse all roles + fan-out publish one listing per role
CI/CD / CI · dotnet build (push) Successful in 2m16s
CI/CD / Deploy · hamkadr (push) Has been cancelled
An ad like «استخدام پرستار سالمند و کودک و همراه بیمار» names several roles;
we kept only the first. Now:
- Parser collects ALL roles (ParsedListing.RoleNames): exact taxonomy
  matches (substring-deduped so پرستار⊂پرستار سالمندان) plus synonyms
  (سالمند→پرستار سالمندان, کودک/همراه بیمار→پرستار, اتاق عمل→تکنسین اتاق عمل…),
  capped at 4.
- Ingestion publishes one Shift/Job/Talent per resolved role (AI role +
  parser roles, distinct, capped), so each role is independently
  browsable and filterable. RawListing dedupe is unchanged (one raw → N posts).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 10:58:29 +03:30
soroush.asadi 2bb8771ade Normalize ریال→تومان pricing; stop exposing crawl source (medjobs/telegram)
CI/CD / CI · dotnet build (push) Successful in 29s
CI/CD / Deploy · hamkadr (push) Successful in 42s
- Parser now reads the currency: ریال amounts (incl. «میلیون ریال» and
  numbers with no تومان unit but ≥200M) are converted to تومان (÷10), so
  «۴۰۰٬۰۰۰٬۰۰۰ ریال» shows as ۴۰٬۰۰۰٬۰۰۰ تومان instead of 400M.
- Aggregated facility fallback name no longer embeds the source
  («مرکز درمانی (از مدجابز)» → «مرکز درمانی (نامشخص)»).
- Talent details only ever names Divar as a fallback source (when the
  number couldn't be extracted); medjobs/telegram are never shown publicly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 09:05:34 +03:30
soroush.asadi 0622270cd2 Fix: site-wide phone on every Medjobs ad + phone mistaken for price
CI/CD / CI · dotnet build (push) Successful in 2m7s
CI/CD / Deploy · hamkadr (push) Successful in 1m59s
- HarvestPhones was run over the whole page, so Medjobs' own header/footer
  number (09101016110) was appended to every ad. Now harvest only the ad's
  description region in Medjobs + Website sources; the protected number
  still comes from the reveal call. No more duplicate number across ads.
- The amount extractor read phone digits as a Toman price
  (۹,۱۰۱,۰۱۶,۱۱۰ تومان). The parser now strips «شماره تماس…» lines and
  mobile/landline numbers before extracting money, and only accepts 6–10
  digit numbers with no leading zero (phones/ids start with 0 or are 11+).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 08:42:21 +03:30
soroush.asadi 213af9db48 AI tag/category assignment + phone extraction from web ads
CI/CD / CI · dotnet build (push) Successful in 2m37s
CI/CD / Deploy · hamkadr (push) Successful in 1m11s
AI (when enabled, now that the server proxy is up):
- AiStructured gains phone, personName, yearsExperience, isLicensed.
- The auditor appends an authoritative output-schema to the admin prompt
  so classification stays correct even with an older stored prompt — it
  now classifies kind as shift|job|talent and extracts the contact phone
  and talent details.
- Ingestion publish prefers the AI's tags (kind/role/city/facility/phone +
  talent fields) over the heuristic parser when present.
- Default prompt updated to describe the three kinds + new fields.

Phone extraction from websites (Medjobs / generic sites), where the
number sits behind a "تماس با این آگهی" reveal:
- HtmlUtil.HarvestPhones scans the full markup for tel: links, JSON-LD
  "telephone", data-*phone* attributes, and inline Iranian mobile/landline
  numbers (Persian digits folded), normalized (mobiles 09…, landlines 0…).
- Medjobs + Website sources append harvested numbers to the ad text so the
  parser/AI capture them; manual review then prefills the phone too.
- Parser phone extraction now also captures a landline as a fallback.

Note: if a site loads the number purely via XHR (not in HTML), a
per-source reveal endpoint would be a follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 08:11:14 +03:30
soroush.asadi 4e5df73cf7 Add «آماده به کار» (talent) listing type — workers offering themselves
CI/CD / CI · dotnet build (push) Successful in 1m41s
CI/CD / Deploy · hamkadr (push) Has been cancelled
Adds a third listing kind alongside Shift/Job for healthcare staff who
advertise their own availability (very common in Iranian medical
channels, e.g. "دندانپزشک آماده همکاری… ۵۰٪ تسویه"). These have no
facility; the contact phone is the key field.

- Model: TalentListing (role, person name, years, licensed, city/district,
  area note, availability, gender, comp, phone) + ListingKind.Talent +
  RawListing.LinkedTalentId + DbSet/relations/indexes + EF migration.
- Parser: detect آماده‌به‌کار/جویای کار → Kind=Talent; extract person name,
  years of experience, licensed flag, area («منطقه ۱»), phone. Facility
  name extraction now skipped for talent.
- Validator: talent path scores role + phone + medical (no facility/pay
  required).
- Ingestion auto-publish: creates a TalentListing for talent kind.
- Review (manual publish): Talent option + talent fields; publishes a
  TalentListing without a facility. Shift/Job facility now falls back to a
  shared «نامشخص / ثبت نشده» record when the ad names none — publishing
  never fails on a missing facility.
- Browse /Talent (indexable, filters: city/district/role/gender),
  details /Talent/Details (noindex — personal contact, tel: call button),
  _TalentCard, badge-talent, nav link, home section.
- Sitemap includes /Talent; robots disallows /Talent/Details. Archiver
  expires stale talent listings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 08:01:12 +03:30
soroush.asadi e6a796ab27 Match crawled listings to existing facilities (fuzzy) before creating new
CI/CD / CI · dotnet build (push) Successful in 1m28s
CI/CD / Deploy · hamkadr (push) Successful in 2m24s
When publishing a scraped listing we now look for a facility we already
have that is exactly or closely the same, and only create a new one when
there is no match — avoiding duplicates like «بیمارستان میلاد» vs «میلاد».

- ListingParser: extract a facility name (keyword + distinctive words) from
  the post and surface it in the parser notes.
- FacilityMatcher: Persian-aware normalization (ي/ك, ZWNJ, punctuation),
  type-word stripping for a "core" name, contains + Levenshtein similarity,
  and FindBest (same-city exact → any-city exact → same-city fuzzy → fuzzy).
- Review (manual publish): auto-select a matching facility or prefill the
  new-facility name; resolve-or-create uses fuzzy match; dropdown preselects.
- IngestionService (auto-publish): reuse FacilityMatcher against a run-wide
  facility list (grows as new ones are created) instead of exact-name only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 07:14:48 +03:30
soroush.asadi 018c0f0286 [Ingest] Tune parser/validator for real Divar+Medjobs data
CI/CD / CI · dotnet build (push) Successful in 2m53s
CI/CD / Deploy · hamkadr (push) Failing after 2m39s
Analyzed live Divar (POST search) and Medjobs (ad_listing sitemaps) data — both are free Persian text. Tighten the medical-relevance gate (drop generic «استخدام»/«شیفت» that match retail/restaurant ads; add clinical terms: بهیار/اتاق عمل/بیهوشی/رادیولوژی/آزمایشگاه/دیالیز/فوریت/تریاژ/… ) so off-topic Divar jobs get flagged, not treated as medical. Add clinical role synonyms in the heuristic parser (بهیار/کمک‌پرستار/سالمند→پرستار, اتاق عمل→تکنسین اتاق عمل, فوریت→فوریت‌های پزشکی, آزمایشگاه→کارشناس آزمایشگاه, فوق‌تخصص→پزشک متخصص…). Result on live data: Medjobs now yields ~9/30 queue-ready healthcare listings; Divar correctly flags ~72/75 noise for manual review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 22:34:05 +03:30
soroush.asadi 6cfdd16c42 Add gender requirement (آقا/خانم/فرقی نمی‌کند) + employee (کارجو) panel
CI/CD / CI · dotnet build (push) Successful in 6m23s
CI/CD / Deploy · hamkadr (push) Failing after 6m30s
- Gender enum + GenderRequirement on Shift/JobOpening + Gender on UserPreferences (migration)
- Employer PostShift/PostJob + admin Review have a gender select; parser detects آقا/خانم/مرد/زن
- Gender badge on cards + detail; gender filter on Shifts/Jobs; gender in preferences
- Recommendations exclude listings whose gender requirement conflicts with the person's gender
- Two panels: new /Me employee (کارجو) panel (recommendations + saved + applied + prefs) alongside /Employer; nav routes by role; /Account/Profile → /Me

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 00:19:32 +03:30
soroush.asadi 563a40d1f4 Add hiring, AI parser+admin, OTP auth, employer dashboard, profit-share pay
- Hiring (استخدام) listings: JobOpening + /Jobs browse/detail + home section
- Heuristic Persian listing-parser + admin queue (/Admin) → publish shift/job
- Phone-OTP cookie auth + visitor-history linking + profile; Admin role gate
- Employer side: self-serve facility registration, dashboard, post/manage shifts & jobs, applicants list with contact
- Compensation models: fixed / hourly / profit-share (درصدی) / negotiable / choice (به انتخاب شما); SharePercent + JalaliDate.PayLabel; parser + filter

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 06:26:54 +03:30
soroush.asadi 2fb86a435e Initial commit — Hamkadr (همکادر) healthcare-staffing marketplace
ASP.NET Core 10 Razor Pages + PostgreSQL/EF Core. RTL Persian, Jalali dates, self-hosted Vazirmatn, teal/coral brand.

Features:
- Shift listings: browse/filter (city, district, role, type, pay), weekly Jalali calendar, detail + interest handoff, near-me distance sort
- Hiring (استخدام) listings with employment type + salary range
- Pattern-engine recommendations + anonymous interest tracking (visitor cookie)
- Heuristic Persian listing-parser + admin queue (raw channel post → shift/job)
- Phone-OTP cookie auth + visitor-history linking + profile

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 01:44:24 +03:30