JavaScript Postback Download via PowerShell

Recently, one of my clients needed to automatically download a file from a public-facing state government website.  Normally this can be done in any number of ways.  PowerShell is the first that comes to mind, but you could also use tools such as wget or curl, just to name a couple.  However, thanks to the awesome power (note: sarcasm) of DotNetNuke, the download link is hidden behind JavaScript postback functionality.

Essentially, a postback is where a web page contains a form that collects data.  That data can come from text fields or even a button or link click.  When the form is submitted, its data is sent back to the same page that the form came from.  This is the “postback” part of the process: the form “posts” data back to itself and the page returns the appropriate result.  In this case, the result is a downloadable file.
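
To make the “posts data back to itself” idea concrete, here is a minimal sketch of how a form's fields end up as the body of a POST request.  The field names match the ones used later in this post and the values are placeholders.  The hand-rolled encoding is for illustration only, since Invoke-WebRequest does this for you when it is given a dictionary, and it glosses over details such as how spaces are encoded.

```powershell
# Field names and values a postback form might submit (values are placeholders)
$fields = [ordered]@{
    '__EVENTTARGET'   = 'dnn$abcd1234$File$ExcelFile'
    '__EVENTARGUMENT' = ''
}

# Join the fields as name=value pairs, percent-encoding each piece
$body = ($fields.GetEnumerator() | ForEach-Object {
    '{0}={1}' -f [uri]::EscapeDataString($_.Key), [uri]::EscapeDataString($_.Value)
}) -join '&'

# This string is roughly what travels back to the page in the request body
$body
```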

Disclaimer: I’m not a PowerShell expert by any means.  Also, the script requires PowerShell 3.0 or higher because Invoke-WebRequest, which it relies on, was introduced in that version.

The Script

In a nutshell, the script has to do the following:

  1.  Obtain the URL from a variable
  2.  Get some session information so that we can “fake” out the server
  3.  Request the page again using that session and grab its form
  4.  Create fields on the form and set them to the appropriate values
  5.  Specify a file name
  6.  Send the form back to the server

So, let’s break this apart and step through it:

Of course, this could be turned into a script that accepts parameters.  In this case, however, I don’t need that, so the URL is set statically.

#URL that needs to be fetched
$url = "https://site.state.gov/default.aspx"

#get the server name in case the script is moved to another server
$serverName = $env:computername

While I am here, I also retrieve the server name, which is used later in the notification email.  This way, if the script is moved to another server, the process itself shouldn’t break.  I tried to have a little forward thinking here.
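
If you did want the script to accept its values from the caller, one possible shape is a param block at the top of the file.  This is only a sketch: the parameter names are my own, and the defaults simply mirror the static values above.

```powershell
# Hypothetical script header; param() must be the first statement in the .ps1 file
param(
    # Page that hosts the postback link
    [string]$Url = "https://site.state.gov/default.aspx",

    # Name reported in the failure email; defaults to the machine running the script
    [string]$ServerName = $env:COMPUTERNAME
)
```

The script could then be called with something like .\Get-StateFile.ps1 -Url "https://site.state.gov/default.aspx", where Get-StateFile.ps1 is a made-up file name.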

Next, I’ll use Invoke-WebRequest with the $url variable along with the -SessionVariable parameter.  This parameter creates a web request session object and assigns it to the named variable, here $session (note that the name is passed without the dollar sign).  I am also putting things into a TRY/CATCH block because I want to make sure something happens if things go south during this process.

TRY {
#use invoke-webrequest to fetch a session from the site
Invoke-WebRequest $url -SessionVariable session -UseBasicParsing
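
To get a feel for what the session object carries, the sketch below builds one by hand and lists its cookies.  The cookie name and value are assumptions for illustration; in the real script, $session is filled in by the request above and the server decides what goes in it.

```powershell
# Stand-in for the session that -SessionVariable creates (no network needed)
$session = New-Object Microsoft.PowerShell.Commands.WebRequestSession

# Hypothetical cookie; a real server chooses its own names and values
$cookie = New-Object System.Net.Cookie('ASP.NET_SessionId', 'abc123', '/', 'site.state.gov')
$session.Cookies.Add($cookie)

# Show which cookies would be sent back to the site on the next request
$session.Cookies.GetCookies([uri]'https://site.state.gov/default.aspx') |
    Select-Object Name, Value
```

Running the last statement against the real $session is a quick way to confirm that the first request actually established a session.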

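Now we’ll call the Invoke-WebRequest cmdlet again against the same URL, this time passing in the session.  This lets us obtain the form from the page, along with the fields it contains.  Note that this call omits -UseBasicParsing, because in Windows PowerShell the Forms collection is only available when the full parser is used.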

#add a site using the session information from the above web request
$addUserSite = Invoke-WebRequest $url -WebSession $session #get the website $url using the session contained in $session 
$addUserForm = $addUserSite.Forms[0] #Invoke-WebRequest does a lot of auto processing.

When dealing with a postback process, there are usually certain fields on the form that are critical.  In this case, there were two fields that I needed:

  • __EVENTTARGET
  • __EVENTARGUMENT

Depending on the form and the requirements, the JavaScript could be expecting more or fewer fields, so your mileage will vary.  I used Fiddler to help identify which fields were needed.  I’ll blog about Fiddler in an upcoming post.  You can also determine what is needed from the page source; you just have to go looking for it.
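
If you go the page source route, postback links usually look like a call to __doPostBack with two quoted arguments, the event target and the event argument.  Here is a rough sketch of pulling those two values out of such a link.  The href is a made-up example using the placeholder control ID from this post, and the pattern assumes single-quoted arguments with no spaces between them.

```powershell
# Hypothetical href copied from the page source; the control ID is a placeholder
$href = "javascript:__doPostBack('dnn`$abcd1234`$File`$ExcelFile','')"

# __doPostBack(eventTarget, eventArgument): capture both quoted arguments
if ($href -match "__doPostBack\('([^']*)','([^']*)'\)") {
    $eventTarget   = $Matches[1]
    $eventArgument = $Matches[2]
}

# These are the values for __EVENTTARGET and __EVENTARGUMENT
$eventTarget
$eventArgument
```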

Now that I know which fields I need, I can continue with the script:

#add form fields for the event target & argument that are needed to actually do the postback
$addUserForm.Fields["__EVENTTARGET"] = "dnn`$abcd1234`$File`$ExcelFile" 
$addUserForm.Fields["__EVENTARGUMENT"] = ""
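
Before posting, it can help to confirm that the fields you rely on are really there and to see everything that is about to be sent.  The sketch below uses a stand-in dictionary in place of $addUserForm.Fields so it runs on its own.  The __VIEWSTATE entry is an assumption about what a typical ASP.NET page includes, not something taken from this particular site.

```powershell
# Stand-in for $addUserForm.Fields; real values come from the page itself
$fields = New-Object 'System.Collections.Generic.Dictionary[string,string]'
$fields['__VIEWSTATE']     = 'opaque-placeholder'   # assumed hidden field
$fields['__EVENTTARGET']   = 'dnn$abcd1234$File$ExcelFile'
$fields['__EVENTARGUMENT'] = ''

# Fail early if a field we rely on is missing
$required = '__EVENTTARGET', '__EVENTARGUMENT'
$missing  = $required | Where-Object { -not $fields.ContainsKey($_) }
if ($missing) { throw "Missing form fields: $($missing -join ', ')" }

# Print what is about to be posted
$fields.GetEnumerator() | Sort-Object Key | Format-Table Key, Value -AutoSize
```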

I need to be able to download the file, so I need a file name:

#where are we saving the file & what is file name
$filename = "C:\temp\Download_File_Name.txt"

Now that everything is set, we do another Invoke-WebRequest and send everything back to the page.  I’ve also included the -OutFile parameter and passed in the $filename value that I set in the step above.  The cmdlet then saves the response to the specified directory and file name.  I also end the TRY block here.

#invoke another web request using the same url and session information, and save the response to the path in $filename
Invoke-WebRequest -Uri $url -Method Post -Body $addUserForm.Fields -WebSession $session -UseBasicParsing -OutFile $filename
}
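
One thing the script does not do is check what actually landed on disk.  In my experience a postback that goes wrong can simply return the HTML page again, which -OutFile will happily save.  The following is a minimal sketch of a sanity check, assuming the expected file is not HTML; it writes a stand-in file to the temp folder so it can run on its own.  In the real script, a check like this would go inside the TRY block right after the download so that a failure reaches the CATCH block.

```powershell
# Stand-in for the downloaded file so the sketch runs on its own
$filename = Join-Path ([System.IO.Path]::GetTempPath()) 'Download_File_Name.txt'
Set-Content -Path $filename -Value 'col1,col2'

# Throw if the file is missing, empty, or looks like an HTML page
$file      = Get-Item -Path $filename -ErrorAction Stop
$firstLine = Get-Content -Path $filename -TotalCount 1
if ($file.Length -eq 0 -or $firstLine -match '^\s*<(!DOCTYPE|html)') {
    throw "Download to $filename does not look like the expected file"
}
```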

Finally, I start the CATCH block for the TRY/CATCH.  For this client, the best way to report a failure was an email.  The email is sent to their help desk provider, which automatically generates a ticket for the IT staff to investigate.

# catch any errors from above and send an email to the right people
CATCH {
Send-MailMessage -To "it@somecompany.com" -From "donotreply@somecompany.com" -Subject "Some important thing just happened" -SmtpServer "server.smtp.com" -Body "Check stuff out on $serverName"
}
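
The email body above only says where to look.  If you want the ticket to say what went wrong as well, the error record is available inside the CATCH block as $_.  Here is a small sketch of building a more descriptive body; the simulated failure and the message wording are placeholders of mine.

```powershell
$serverName = $env:COMPUTERNAME

try {
    throw "Simulated download failure"   # stand-in for a failing Invoke-WebRequest
}
catch {
    # $_ is the error record; its exception message says what went wrong
    $body = 'Download failed on {0}: {1}' -f $serverName, $_.Exception.Message
}

# This string could be passed to Send-MailMessage via -Body
$body
```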

The completed script can now be scheduled based on the client’s needs.  I put the script on a server and created a scheduled task within Windows.  You could also put this into a SQL Agent job if needed, as long as the version of PowerShell available to SQL Server is 3.0 or higher.
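
I created my scheduled task through the Windows interface, but it can also be registered from PowerShell.  The sketch below assumes a system that ships the ScheduledTasks module (Windows 8 / Server 2012 or later), and the script path, task name, and schedule are all made up.  Registering a task typically needs an elevated session.

```powershell
# Hypothetical script path, task name, and schedule
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument '-NoProfile -ExecutionPolicy Bypass -File "C:\Scripts\Get-StateFile.ps1"'

# Run once a day at 6:00 AM
$trigger = New-ScheduledTaskTrigger -Daily -At '6:00 AM'

Register-ScheduledTask -TaskName 'Download state file' -Action $action -Trigger $trigger
```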

Summary

Handling the postback proved to be more complicated than I had originally thought, and it took some trial and error.  I tried other methods as well, such as wget and curl, but I found that PowerShell was ultimately the better solution.  Now the client has an automated way to download the file without manual intervention.

You can get the full script from here.

Enjoy!

 

 

© 2018, John Morehouse. All rights reserved.
