I've been building my own homelab setup for a few years now, and usually I'm pretty good at snagging parts when they drop, but lately everything has gone sideways. I used to just rely on some basic Chrome extensions or maybe a simple Python script with BeautifulSoup to scrape pages for "in stock" text, but these retail sites have gotten so much more aggressive with their bot detection and dynamic rendering. It's like every time I find a good price on a specific rack-mount chassis or even some Noctua fans here in the UK, the page layout changes or they throw up a Cloudflare wall that my old scripts just can't handle anymore.
I really need to find a more robust way to manage these alerts across about ten different shops without having forty tabs open constantly draining my RAM. I'm looking for something that uses a headless browser approach or has some kind of cloud-based monitoring so I don't have to keep my PC running 24/7. My budget is probably around 15 quid a month if the service is actually reliable and won't get my IP banned. Do you guys use any specific platforms or self-hosted tools that are actually good at handling sites with heavy JavaScript or weird DOM structures? Most of the free stuff I've tried lately just pings me with false positives every time a "recommended products" slider updates...
Late to the party here, but @Reply #2 - good point! Scraping the DOM is a total headache. I actually moved away from self-hosting scripts like the ones mentioned because my home IP got flagged so fast it wasn't even funny. Spent ages last month trying to snag some Noctua fans too... Visualping ended up being my choice because it handles the cloud-side stuff without needing a complex proxy setup or a dedicated server. For me, the managed route just works better:
Saw this yesterday and yeah, it is a bit of a nightmare lately. Honestly, I feel your pain. Spending way too much time building custom scrapers, only to have them break because a site decided to tweak its CSS classes or move a button inside a shadow DOM, is the worst. Unfortunately, even the headless browser stuff people recommend isn't always the magic bullet it sounds like. Running Puppeteer on a local server was my go-to for a while, but keeping up with the constant cat-and-mouse game with Cloudflare is just exhausting. I had several self-hosted setups get blocked almost immediately, because a standard home IP looks suspicious to their WAF rules once you start hitting them regularly. If you are willing to spend some of that budget, I have had decent luck with Distill.io recently. It's not as good as it used to be back in the day and the cloud pricing tiers are kinda annoying, but it is still way better at handling heavy JavaScript than most. You can set it to monitor specific sub-elements, which helps avoid those stupid false positives from recommended product sliders that ruin your sleep. Just don't set the crawl frequency too high or you will get flagged anyway. Ngl, it is kinda disappointing that we even have to go this far just to buy a rack-mount chassis or some fans, but that's the state of the web right now. Sometimes I just give up and join a stock tracker Discord, because letting someone else handle the scraping headache is worth the peace of mind.
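For anyone who still wants to roll their own, here is a rough Python sketch of the "watch one sub-element" idea. To be clear, this is not how Distill.io or Visualping work internally (I have no idea), and it does nothing about Cloudflare or JavaScript-rendered pages: it assumes you already have the HTML as a string. The element id `stock`, the sample pages, and the helper names `element_text`, `fingerprint` and `next_delay` are all made up for illustration. It uses the stdlib `html.parser` rather than BeautifulSoup just so it runs without installing anything.

```python
import hashlib
import random
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElementText(HTMLParser):
    """Collect the text inside the first element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0       # 0 = outside the target, >0 = nesting depth inside it
        self.done = False    # set once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def element_text(html, target_id):
    """Return whitespace-normalised text of the element with this id ('' if absent)."""
    parser = ElementText(target_id)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def fingerprint(text):
    """Hash the watched text so only a short digest needs storing between checks."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def next_delay(base_seconds, jitter=0.2):
    """Randomise the wait between checks so requests are not on a fixed beat."""
    return base_seconds * random.uniform(1 - jitter, 1 + jitter)


# Hypothetical pages: only the slider differs between the first two.
before = '<div id="stock">Out of stock</div><div class="slider">Fan A</div>'
slider_changed = '<div id="stock">Out of stock</div><div class="slider">Fan B</div>'
restocked = '<div id="stock"><span>In stock</span></div><div class="slider">Fan B</div>'

old = fingerprint(element_text(before, "stock"))
slider_alert = fingerprint(element_text(slider_changed, "stock")) != old
restock_alert = fingerprint(element_text(restocked, "stock")) != old
```

The point is that hashing only the watched element means a slider update leaves the fingerprint alone, while a change inside the stock element flips it. The jittered delay is just one simple way to avoid hammering a shop on an exact schedule; it is no guarantee against getting flagged.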