Bash: Checking if a website has changed via terminal

Bash scripting yay!

I have a few sites that I like which don't get updated regularly. I have a habit of keeping these pages open in browser tabs. So I created a simple bash script that shows in the terminal whether a site has changed.

 

Requirements: common Unix tools like bash, curl, md5sum…

Why?

For me the main reason is speed. It's simply faster to check for changes at the source level than to render everything graphically. It also saves bandwidth. In theory, changing content such as banners and ads shouldn't affect the result, because it is fetched from elsewhere and doesn't alter the page's own source. Note that this script does not work on sites requiring a login. It may also have bugs and nasties and may break the Internet, but for now, on my test cases, it works.

Our script

First we need a file for our script. I call it check.sh. Set it as executable and insert the code below:

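#!/bin/bash
# -- a script to log the md5 checksum of a web page
# variables
pageTitle="Title of the site"
# -s keeps curl's progress meter off the terminal
page="$(curl -s http://www.LOCATION/OF/PAGE/)"
dateFormat=$(date +"%m-%d-%Y:%H:%M:%S")
# md5pipe: quote $page so whitespace and glob characters are hashed unchanged
md5get=$(printf '%s\n' "$page" | md5sum | awk '{print $1}')
# display: append the new entry, then show the most recent ones
echo "$pageTitle@$dateFormat // $md5get" >> logfile.txt
echo "== TAIL =="
tail logfile.txt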

So what are the lines for?

The first three lines of code define some variables. pageTitle is simply there to help us remember which site we are checking, page holds the page source downloaded with curl, and dateFormat holds the current timestamp in the format we write to the log.

The fourth line of code pipes the contents of that URL into md5sum, whose output is then trimmed by awk to just the hash. This gives us the md5 checksum of the page source, which differs whenever that source changes.
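To see the checksum step in isolation, here is a minimal sketch that hashes two stand-in strings instead of a downloaded page. The function name page_checksum and the sample HTML are made up for illustration; only the md5sum and awk pipeline comes from the script above.

```shell
#!/bin/bash
# Hypothetical helper (not part of check.sh): print the md5 checksum
# of whatever arrives on standard input.
page_checksum() {
    # md5sum prints "<hash>  -"; awk keeps only the hash.
    md5sum | awk '{print $1}'
}

# Two made-up "pages" that differ by a single word.
old=$(printf '%s' '<p>Hello world</p>' | page_checksum)
new=$(printf '%s' '<p>Hello there</p>' | page_checksum)

echo "old: $old"
echo "new: $new"
if [ "$old" != "$new" ]; then
    echo "checksums differ"
fi
```

Even a one-word difference in the input gives a completely different 32-character hash, which is what makes the log easy to scan by eye.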

Then we append the checksum, title and timestamp to a file called logfile.txt. Finally, we display the end of the log file so that we can see whether there have been changes.
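Comparing checksums by eye works, but the last two log lines can also be compared automatically. The following is only a sketch, assuming the "title@timestamp // checksum" line format written by the script; the function name report_change and the sample log entries are invented for the example.

```shell
#!/bin/bash
# Hypothetical helper (not part of check.sh): compare the checksums on
# the last two lines of a log in "title@timestamp // checksum" format.
report_change() {
    local log="$1" count prev curr
    count=$(awk 'END {print NR}' "$log")
    if [ "$count" -lt 2 ]; then
        echo "not enough entries"
        return
    fi
    # The checksum is the last whitespace-separated field of each line.
    prev=$(tail -n 2 "$log" | head -n 1 | awk '{print $NF}')
    curr=$(tail -n 1 "$log" | awk '{print $NF}')
    if [ "$prev" = "$curr" ]; then
        echo "unchanged"
    else
        echo "CHANGED"
    fi
}

# Demo on a temporary log with two made-up entries (placeholder checksums).
demo=$(mktemp)
printf '%s\n' \
    'Title of the site@01-01-2024:10:00:00 // aaa111' \
    'Title of the site@01-02-2024:10:00:00 // bbb222' > "$demo"
report_change "$demo"   # prints "CHANGED"
rm -f "$demo"
```

Calling report_change logfile.txt at the end of check.sh would print a one-word verdict under the tail output.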

Future plans?

While it would be easy to expand this in various directions, I probably won't. There were some things I did consider:

  • Make it a cron job. This would automate the check process.
  • Support multiple URLs. While this would be nice, I just copied the script under a different name and changed a few variables, as that was simpler for my needs.
  • Make the logfile display the URL in a way that Terminology (Enlightenment's terminal) can open it. This is something I may do once I get more familiar with Terminology.
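The first two ideas could be sketched roughly as follows. This is not something the original script does: the function names, the "title|url" list format and the file name sites.txt are all assumptions made for the example. Only the curl, md5sum, awk and date usage mirrors check.sh.

```shell
#!/bin/bash
# Hypothetical fetcher, kept as a function so it can be swapped out.
fetch_page() {
    curl -s "$1"
}

# Append one "title@timestamp // checksum" line to the log for a site.
check_site() {
    local title="$1" url="$2" log="$3" stamp sum
    stamp=$(date +"%m-%d-%Y:%H:%M:%S")
    # </dev/null stops the fetcher from consuming the list being read.
    sum=$(fetch_page "$url" < /dev/null | md5sum | awk '{print $1}')
    echo "$title@$stamp // $sum" >> "$log"
}

# Read "title|url" pairs, one per line, from a list file.
check_all() {
    local list="$1" log="$2" title url
    while IFS='|' read -r title url; do
        if [ -n "$url" ]; then
            check_site "$title" "$url" "$log"
        fi
    done < "$list"
}

# Run only when a list file is given, e.g.: ./check.sh sites.txt
if [ -n "$1" ]; then
    check_all "$1" "${2:-logfile.txt}"
    echo "== TAIL =="
    tail "${2:-logfile.txt}"
fi
```

To automate it, a crontab line such as `*/30 * * * * /path/to/check.sh /path/to/sites.txt /path/to/logfile.txt` would run the check every 30 minutes; the paths are placeholders.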

 

by DeusIX. Posted in Programming, Technology.
