There is a website, https://www.guidgenerator.com/online-guid-generator.aspx, that generates globally unique identifiers (GUIDs). I'm trying to use Perl's WWW::Mechanize to post to the site and extract a GUID. I realize the page is JavaScript-based, but I was wondering whether I could make the right POST to pull the GUIDs. I traced the request in the browser and copied all of its headers, but the HTML returned to my script does not contain the GUID.
This is the result of a successful run:
<textarea name="txtResults" rows="2" cols="20" id="txtResults" style="font-family:Courier New,Courier,monospace;font-size:Larger;font-weight: bold;Height: 152px; Width: 421px;">qk5DF22bhkm4C2AwZ5OcZw==</textarea>
This is what my script gets back:
<textarea name="txtResults" rows="2" cols="20" id="txtResults" style="font-family:Courier New,Courier,monospace;font-size:Larger;font-weight: bold;Height: 152px; Width: 421px;"></textarea>
This is the form within the page:
In my script I dumped the following required form and input fields:
my @forms = $mech->forms;
foreach my $form (@forms) {
    my @inputfields = $form->param;
    print Dumper \@inputfields;
}
Result:
$VAR1 = [
    '__EVENTTARGET',
    '__EVENTARGUMENT',
    '__LASTFOCUS',
    '__VIEWSTATE',
    '__VIEWSTATEGENERATOR',
    '__EVENTVALIDATION',
    'txtCount',
    'chkUppercase',
    'chkBrackets',
    'chkHypens',
    'chkBase64',
    'chkRFC7515',
    'chkURL',
    'LocalTimestampValue',
    'btnGenerate',
    'txtResults'
];
This is the POST:
my $mainpage = "https://www.guidgenerator.com/online-guid-generator.aspx";
$mech->post(
    "$mainpage",
    fields => {
        'txtCount'             => "1",
        'chkBase64'            => "on",
        'LocalTimestampValue'  => "Date%28%29.getTime%28%29",
        'btnGenerate'          => "Generate+some+GUIDs%21",
        'txtResults'           => "",
        '__EVENTTARGET'        => 'on',
        '__EVENTARGUMENT'      => 'on',
        '__LASTFOCUS'          => 'on',
        '__VIEWSTATEGENERATOR' => "247C709F",
        '__VIEWSTATE'          => 'on',
        '__EVENTVALIDATION'    => 'on',
        'chkUppercase'         => 'off',
        'chkBrackets'          => 'off',
        'chkHypens'            => 'off',
        'chkRFC7515'           => 'off',
        'chkURL'               => 'off',
    },
);
When I trace the request in the browser, I get the headers, but there is another tab called "Payload" that contains most of the fields listed above. I tried sending those fields in a POST, but I'm not sure whether I should do this differently, or whether it doesn't matter because the page relies on JavaScript.
I know this is a lot of information. I'm not even sure Perl's Mechanize can extract this information. Any help would be greatly appreciated. Please let me know if there is any other data you'd like me to post here.
P粉714890053 2024-04-03 09:56:29
You can use Mech's built-in functionality to do this. No need to submit any additional fields or headers.
use strict;
use warnings;
use feature 'say';
use WWW::Mechanize;
my $mech = WWW::Mechanize->new;
$mech->get('https://www.guidgenerator.com/online-guid-generator.aspx');
$mech->field( txtCount => 10 );
$mech->click;
say $mech->value('txtResults');
This will output the following:
$ perl mech.pl
211b3cad1665483ca303360bdbda0c61
ecc3348d83cb4bb5bdcb11c6148c5ae1
0a3f2fe5748946a1888a4a5bde8ef2e6
acb26deb9fda4411aa64638cdd1ec5f1
2afe609c355b4a10b6a0ae8c74d3aef1
30fd89ab170147cfb24f131346a203e3
2301d258e1d045aa8f0682f2ea14464c
f064507ca3e14a4eb860b0a30ba096ed
9a42b15d5c79420c921dcc07c306459b
5bea2e345f75453caaf795681963866a
The key here is that you cannot use $mech->submit, because it does not send the value of the submit button, which is a bit annoying. Instead you have to use $mech->click, which pretends that the default submit button of the default form was clicked, so the button's value is submitted as well. This is how buttons work in a form; in this case the backend checks the submitted values to see which button was clicked.
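The difference can be seen without touching the network. The following is a minimal sketch using HTML::Form, the module Mech builds its forms on. The markup is a reduced stand-in for the real page, not a copy of it; only the field names txtCount and btnGenerate and the button label come from the question. It assumes that $mech->submit amounts to make_request on the current form and that $mech->click amounts to the form's click method.

```perl
use strict;
use warnings;
use feature 'say';
use HTML::Form;

# Reduced stand-in for the real form (assumption: the real page has many more fields).
my $html = <<'HTML';
<form method="post" action="/online-guid-generator.aspx">
  <input type="text" name="txtCount" value="1">
  <input type="submit" name="btnGenerate" value="Generate some GUIDs!">
</form>
HTML

# In scalar context, parse returns the first form in the document.
my $form = HTML::Form->parse( $html, 'https://www.guidgenerator.com/' );

# Request body built without clicking anything: the submit button is left out.
my $submitted = $form->make_request->content;

# Request body built by clicking the first clickable input: the button is included.
my $clicked = $form->click->content;

say "without click: $submitted";
say "with click:    $clicked";
```

Only the second body carries a btnGenerate pair, which is the field the backend is looking for.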
You can then use $mech->value to get the field's value. You will probably want to split it to process it further.
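As a sketch of that last step, assuming the textarea holds one GUID per line: the string below is a hypothetical stand-in for what $mech->value('txtResults') returns, built from the first two GUIDs of the sample output, and the CRLF line endings are a guess.

```perl
use strict;
use warnings;
use feature 'say';

# Stand-in for: my $results = $mech->value('txtResults');
my $results = "211b3cad1665483ca303360bdbda0c61\r\n"
            . "ecc3348d83cb4bb5bdcb11c6148c5ae1\r\n";

# \R matches any line ending (LF, CRLF, ...); grep drops empty entries.
my @guids = grep { length } split /\R/, $results;

say scalar(@guids), ' GUIDs found';
say for @guids;
```

Splitting on \R rather than "\n" avoids stray carriage returns if the server sends Windows line endings.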
The JavaScript on this page is actually unrelated to the core functionality. All it does is save the settings you selected in a cookie and restore them, so that the same checkboxes are checked when you come back. That is nice, although these days it would probably be better done with local storage on the frontend. Either way, you don't need to deal with JS at all to scrape this page. The main functionality is in the backend.
You may also be interested in $mech->dump_forms, which is a great debugging aid that prints out all forms with their fields and values. Another great debugging aid when using Mech (or any LWP-based class) is LWP::ConsoleLogger::Everywhere. That is what I used to compare the program's requests with the browser's and find the missing button form field.
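A minimal sketch of both aids together, assuming LWP::ConsoleLogger::Everywhere is installed from CPAN. It needs network access, and the exact log output depends on the module version, so none is shown here.

```perl
use strict;
use warnings;
use LWP::ConsoleLogger::Everywhere;    # loading it is enough to log LWP requests and responses
use WWW::Mechanize;

my $mech = WWW::Mechanize->new;
$mech->get('https://www.guidgenerator.com/online-guid-generator.aspx');

# Print every form on the page with its fields and current values (STDOUT by default).
$mech->dump_forms;

# The logged POST can now be compared against the browser's "Payload" tab.
$mech->field( txtCount => 1 );
$mech->click;
```

Remove the LWP::ConsoleLogger::Everywhere line once the requests match, since it logs every request the program makes.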
Disclaimer: I am the maintainer of WWW::Mechanize and I wrote LWP::ConsoleLogger::Everywhere.