Add scrape/ingestion engine + validation, and 24h shift hour-range visualization
Scrape engine (Services/Scraping/): pluggable IListingSource (working sample + Telegram/Divar credential-ready stubs) → IngestionService (content-hash dedupe → parse → validate → review queue) → ListingValidator (completeness score + spam screen) → IngestionWorker (config-gated hosted service). RawListing gains ContentHash/Confidence/ValidationNotes; RawListingStatus.Flagged. Admin /Admin gets run-now, source list, confidence + flagged queue. Hour-range viz: _HourBar 24h timeline bar (colored by type, overnight wrap) on shift cards, recommendation cards, and detail. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
@@ -0,0 +1,15 @@
|
||||
namespace JobsMedical.Web.Services.Scraping;
|
||||
|
||||
/// <summary>One raw post pulled from a source (a Telegram message, a Divar ad, etc.).</summary>
|
||||
public record ScrapedItem(string Source, string RawText, string? SourceUrl = null);
|
||||
|
||||
/// <summary>
|
||||
/// A pluggable source the ingestion engine pulls from. Implement once per channel/site.
|
||||
/// `Enabled` lets a source be present but dormant until it's configured with credentials.
|
||||
/// </summary>
|
||||
public interface IListingSource
|
||||
{
|
||||
string Name { get; }
|
||||
bool Enabled { get; }
|
||||
Task<IReadOnlyList<ScrapedItem>> FetchAsync(CancellationToken ct = default);
|
||||
}
|
||||
Reference in New Issue
Block a user