Amazon Product Page Checker

General Information

Automatically check an Amazon product page for specific information (availability, price, etc.).

Checklist


1) In a web browser, visit the Amazon.com product page you want to auto check and save the URL.

2) Copy your browser's user agent string. (It is sent with curl later on so the request looks like it comes from a regular browser.)

You can find it by visiting: http://www.useragentstring.com/

Alternatively, find the user agent string in Firefox:

  • Click the Open Menu button (three lines, upper right)
  • Click the Developer button, then Network
  • Ensure the “Network” tab is selected and reload the page
  • Click on any of the requests that loaded successfully
    • A new pane opens on the right, with the “Headers” tab open
    • Scroll down to “Request Headers” and find “User-Agent”
    • Select the string to the right of “User-Agent” and copy it somewhere for later use.

3) Using the collected product page URL and user agent string, test a curl against the page.

In this example, I am using a Firefox UA string and checking a PS4 bundle page:

curl -sA "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:33.0) Gecko/20100101 Firefox/33.0" http://www.amazon.com/gp/product/B00O9JLBOC
  • The (-s) option makes curl silent, so the download progress meter does not clutter the output.
  • The (-A) option sets the user agent string. If you do NOT send a user agent string, Amazon displays an error message in the returned page.
  • The last field is the product page URL.
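The same test can be wrapped in a small function so the user agent and URL live in one place. This is only a sketch: the function name fetch_page is made up for illustration, and the -o (save to file) and -f (exit non-zero on HTTP errors) options are additions that the command above does not use.

```shell
#!/bin/sh
# Values from the example above; swap in your own user agent and product URL.
UA="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:33.0) Gecko/20100101 Firefox/33.0"
URL="http://www.amazon.com/gp/product/B00O9JLBOC"

# Hypothetical helper: fetch_page URL OUTPUT_FILE
# -s silences the progress meter, -A sends the user agent, -o saves the
# body to a file, and -f makes curl exit non-zero on HTTP errors (4xx/5xx).
fetch_page() {
    curl -sf -A "$UA" -o "$2" "$1"
}

# Example use (needs network access, so it is left commented out):
# fetch_page "$URL" /tmp/product.html && echo "saved" || echo "fetch failed"
```

Saving the page to a file makes it easy to try several grep patterns against one download instead of hitting the site each time.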

4) Decide what piece of information from the page you want to watch and test grep.

When I wrote the script, that PS4 bundle page had a nice big banner on it that said “Sign up to be notified when this item becomes available”. I wanted to receive an e-mail as soon as this text disappeared from the page. (I did not trust Amazon to actually e-mail me as soon as it was available to buy.)

curl -sA "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:33.0) Gecko/20100101 Firefox/33.0" http://www.amazon.com/gp/product/B00O9JLBOC | grep -o "Sign up to be notified when this item becomes available"
  • Grep's (-o) option displays only the matched text and nothing else.
    • Using this logic, once the item becomes available to buy, the command returns no output.
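One caveat with this logic: grep also prints nothing when the download itself fails, so an empty result is ambiguous. Below is a minimal sketch of telling the cases apart, assuming the page was first saved to a file (for example with curl's -o option) and curl's exit status was captured. The function name classify_page and the sample page contents are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original script.
# Usage: classify_page CURL_EXIT_STATUS SAVED_PAGE_FILE
# Prints one of: retry (fetch failed or page is empty), wait (banner still
# present), alert (page was fetched and the banner is gone).
classify_page() {
    banner="Sign up to be notified when this item becomes available"
    if [ "$1" -ne 0 ] || [ ! -s "$2" ]; then
        echo "retry"
    elif grep -qF -- "$banner" "$2"; then
        echo "wait"
    else
        echo "alert"
    fi
}

# Offline demonstration with made-up page contents instead of a live fetch.
tmpdir=$(mktemp -d)
echo '<div>Sign up to be notified when this item becomes available</div>' > "$tmpdir/unavailable.html"
echo '<div>In Stock.</div>' > "$tmpdir/available.html"
: > "$tmpdir/empty.html"

classify_page 0 "$tmpdir/unavailable.html"   # banner found
classify_page 0 "$tmpdir/available.html"     # banner gone
classify_page 0 "$tmpdir/empty.html"         # nothing was downloaded
classify_page 6 "$tmpdir/available.html"     # curl reported an error
```

grep's -q option suppresses output and reports the match through the exit status only, and -F treats the banner as a fixed string rather than a regular expression.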

Putting the above curl statement into a working script looks like this:

ps4_checker.sh
#!/bin/bash

# Fetch the product page and keep only the banner text (empty if the banner is absent).
RESPONSE=$(curl -sA "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:33.0) Gecko/20100101 Firefox/33.0" "http://www.amazon.com/gp/product/B00O9JLBOC" | grep -o "Sign up to be notified when this item becomes available" | head -n 1)

# Banner missing: send the alert, then comment out this script's cron entry.
if [ "$RESPONSE" != "Sign up to be notified when this item becomes available" ]; then
    /usr/bin/mail -s "PS4 Bundle Available Now!" myname@gmail.com < /home/bill/bin/ps4amazonalert.txt
    crontab -l | sed '/^[^#].*\/home\/bill\/bin\/ps4_checker.sh/s/^/#/' | crontab -
fi
  • RESPONSE: a variable that holds the result of the curl. (either the grep search string or nothing)
  • if statement : If the grep search string is not on the page, e-mail me and let me know the product is available for purchase!
  • /usr/bin/mail : send an e-mail to myself using a pre-written message
  • crontab -l line: Disable the crontab entry by commenting it out. Since this script is launched at a set interval with cron, once the product is available, I would keep getting an e-mail on every run unless the cron entry was disabled.
    • crontab -l : list the crontab entries
    • sed '/^[^#].*\/home\/bill\/bin\/ps4_checker.sh/ : match a line whose first character is NOT a comment marker (^ followed by [^#]), followed by any number of characters (.*), then /home/bill/bin/ps4_checker.sh
    • s/^/#/' : insert a # at the beginning of the matched line, which comments it out
    • crontab - : read the modified listing from standard input and install it as the new crontab
  • If the string is still there, the script does nothing.
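To see what the sed expression does without touching a real crontab, it can be run against some sample text. This is a sketch only: the sample crontab contents (including the backup.sh entry) and the function name disable_checker are made up for illustration.

```shell
#!/bin/sh
# Made-up crontab contents for an offline demonstration; the backup.sh
# entry is hypothetical.
sample_crontab='# m h  dom mon dow   command
0 2 * * * /home/bill/bin/backup.sh
*/15 * * * * /home/bill/bin/ps4_checker.sh'

# Same sed expression as in the script: prefix "#" to any line that does
# not already start with "#" and that mentions the checker script.
disable_checker() {
    sed '/^[^#].*\/home\/bill\/bin\/ps4_checker.sh/s/^/#/'
}

printf '%s\n' "$sample_crontab" | disable_checker
```

Only the ps4_checker.sh line gains a leading #. Running the output through the same filter again changes nothing, because the line now starts with # and no longer matches.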

Don't forget to create the cron entry so the script runs every 15 minutes or so.

crontab -e
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any'). 
# 
# m h  dom mon dow   command
*/15 * * * * /home/bill/bin/ps4_checker.sh

Last modified: 2019/05/25 23:50