I am using HTTPRequest and then HTTPGetResult to capture the HTML of a page.
I now want to extract data from this page.
I only need a few pieces of data, all of which are in a table.
Is there an example of how to do this? If not, can someone tell me which functions I should look at to extract the data I need?
I am trying to extract the part number, manufacturer, quantity, and price.
The code I need to work with starts in a <tr> tag, which is then followed by
<td class="middesc">. That is how I know where one record starts and where the next record starts.
Is there a way to locate a specific tag and then extract XX characters to the right of it, or everything up to the closing tag, then find the next <TAG> and extract XX characters to the right, and so on?
Or is there a way to find all the information between an opening and closing tag with a given label?
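To make the "find a tag, then read up to the closing tag" idea concrete, here is a minimal sketch in Python, not in the language that provides HTTPRequest and HTTPGetResult. The helper names `extract_between` and `extract_all` are made up for illustration, and the sample string is a cut-down stand-in for the real page. The same logic should carry over to any language with a "find substring from position" function.

```python
def extract_between(text, open_marker, close_marker, start=0):
    """Return (value, next_position) for the first span found at or after
    `start`, or (None, -1) when either marker is missing."""
    begin = text.find(open_marker, start)
    if begin == -1:
        return None, -1
    begin += len(open_marker)
    end = text.find(close_marker, begin)
    if end == -1:
        return None, -1
    return text[begin:end].strip(), end + len(close_marker)


def extract_all(text, open_marker, close_marker):
    """Collect every span between the two markers, scanning left to right."""
    values = []
    position = 0
    while True:
        value, position = extract_between(text, open_marker, close_marker, position)
        if value is None:
            break
        values.append(value)
    return values


# Hypothetical two-record sample, trimmed down to the middesc cells.
sample = (
    '<tr><td class="middesc">IGBT 1600A 1200V SINGLE</td></tr>'
    '<tr><td class="middesc">IGBT SINGLE</td></tr>'
)
descriptions = extract_all(sample, '<td class="middesc">', '</td>')
print(descriptions)  # ['IGBT 1600A 1200V SINGLE', 'IGBT SINGLE']
```

Returning the position just past the closing marker is what lets the next search resume where the previous one stopped, which is the "then find the next tag" step.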
Here is a piece of the HTML code that I am trying to extract data from (there are 7 records). There is a bunch of code above this, but it contains nothing I need and I don't think it is relevant.
<tr>
<!--<td class="listcell"><input type="checkbox" name="comp1"></td>-->
<td class="listcell" style="color: #cccccc; width: 104px; text-align: center;">
<div style="height: 80px; width: 104px; overflow: hidden;">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KE3-EUPC">
<img src="/images/catalog/picture-na_s.jpg" alt="FZ1600R12KE3 - more info" border="0" style="font-size: 11px;"width="100px"><br> </a>
</div>
</td>
<td class="listcell desc">
<div class="proddesc">
<table style="height: 78px;">
<tr><td class="topdesc">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KE3-EUPC">
EUPEC<br>TRANSISTOR </a>
</td></tr>
<tr><td class="middesc">IGBT 1600A 1200V SINGLE</td></tr>
<tr><td class="botdesc"><span class="bold">ITEM # </span><A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KE3-EUPC">FZ1600R12KE3</A></td></tr> </table>
</div> <!-- proddesc -->
</td>
<td class="listcell" align="center">
<div class="stockstat">NO STOCK</div>
<div class="shipmsg">Est. Lead Time<br>28 days</div>
</td>
<td class="listcell" style="text-align: center; vertical-align: top;">
<div style="padding: 21px 8px 6px 8px; font-weight: bold;">
$1,859.60 </div>
<div class="volmsg">Volume<br>Discounts<br>Available<br></div> </td>
<td class="listcell" align="center">
<!--
<a href="/scripts/cgiip.exe/wa/wcat/shopcart.r?listtype=Catalog&pnum=FZ1600R12KE3-EUPC&mfgr=EUPEC" style="color: #ff0033; font-weight: 700;">ADD to CART</a>
-->
<span class="bold">QTY. </span>
<input type="text" maxlength="8" value="1" class="descr" style="width: 35px; text-align: right;" name="part_1">
<br><input type="image" value="Submit" src="/images/buttons/r-addtocart2.gif"
onClick="return addToCart(part_1.value,this.form,'FZ1600R12KE3-EUPC');"
name="Add to Cart" id="add_1" style="position: relative; top: 4px;">
</td>
</tr>
<tr>
<!--<td class="listcell"><input type="checkbox" name="comp2"></td>-->
<td class="listcell" style="color: #cccccc; width: 104px; text-align: center;">
<div style="height: 80px; width: 104px; overflow: hidden;">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KF4-EUPC">
<img src="/images/catalog/picture-na_s.jpg" alt="FZ1600R12KF4 - more info" border="0" style="font-size: 11px;"width="100px"><br> </a>
</div>
</td>
<td class="listcell desc">
<div class="proddesc">
<table style="height: 78px;">
<tr><td class="topdesc">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KF4-EUPC">
EUPEC<br>TRANSISTOR </a>
</td></tr>
<tr><td class="middesc">IGBT 1600A 1200V SINGLE</td></tr>
<tr><td class="botdesc"><span class="bold">ITEM # </span><A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KF4-EUPC">FZ1600R12KF4</A></td></tr> </table>
</div> <!-- proddesc -->
</td>
<td class="listcell" align="center">
<div class="stockstat">NO STOCK</div>
<div class="shipmsg">Est. Lead Time<br>27 days</div>
</td>
<td class="listcell" style="text-align: center; vertical-align: top;">
<div style="padding: 21px 8px 6px 8px; font-weight: bold;">
$2,208.28 </div>
<div class="volmsg">Volume<br>Discounts<br>Available<br></div> </td>
<td class="listcell" align="center">
<!--
<a href="/scripts/cgiip.exe/wa/wcat/shopcart.r?listtype=Catalog&pnum=FZ1600R12KF4-EUPC&mfgr=EUPEC" style="color: #ff0033; font-weight: 700;">ADD to CART</a>
-->
<span class="bold">QTY. </span>
<input type="text" maxlength="8" value="1" class="descr" style="width: 35px; text-align: right;" name="part_2">
<br><input type="image" value="Submit" src="/images/buttons/r-addtocart2.gif"
onClick="return addToCart(part_2.value,this.form,'FZ1600R12KF4-EUPC');"
name="Add to Cart" id="add_2" style="position: relative; top: 4px;">
</td>
</tr>
<tr>
<!--<td class="listcell"><input type="checkbox" name="comp3"></td>-->
<td class="listcell" style="color: #cccccc; width: 104px; text-align: center;">
<div style="height: 80px; width: 104px; overflow: hidden;">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KL4C-EUPC">
<img src="/images/catalog/picture-na_s.jpg" alt="FZ1600R12KL4C - more info" border="0" style="font-size: 11px;"width="100px"><br> </a>
</div>
</td>
<td class="listcell desc">
<div class="proddesc">
<table style="height: 78px;">
<tr><td class="topdesc">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KL4C-EUPC">
EUPEC<br>TRANSISTOR </a>
</td></tr>
<tr><td class="middesc">IGBT 1600A 1200V SINGLE</td></tr>
<tr><td class="botdesc"><span class="bold">ITEM # </span><A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KL4C-EUPC">FZ1600R12KL4C</A></td></tr> </table>
</div> <!-- proddesc -->
</td>
<td class="listcell" align="center">
<div class="stockstat">NO STOCK</div>
<div class="shipmsg">Est. Lead Time<br>28 days</div>
</td>
<td class="listcell" style="text-align: center; vertical-align: top;">
<div style="padding: 21px 8px 6px 8px; font-weight: bold;">
$2,208.28 </div>
<div class="volmsg">Volume<br>Discounts<br>Available<br></div> </td>
<td class="listcell" align="center">
<!--
<a href="/scripts/cgiip.exe/wa/wcat/shopcart.r?listtype=Catalog&pnum=FZ1600R12KL4C-EUPC&mfgr=EUPEC" style="color: #ff0033; font-weight: 700;">ADD to CART</a>
-->
<span class="bold">QTY. </span>
<input type="text" maxlength="8" value="1" class="descr" style="width: 35px; text-align: right;" name="part_3">
<br><input type="image" value="Submit" src="/images/buttons/r-addtocart2.gif"
onClick="return addToCart(part_3.value,this.form,'FZ1600R12KL4C-EUPC');"
name="Add to Cart" id="add_3" style="position: relative; top: 4px;">
</td>
</tr>
<tr>
<!--<td class="listcell"><input type="checkbox" name="comp4"></td>-->
<td class="listcell" style="color: #cccccc; width: 104px; text-align: center;">
<div style="height: 80px; width: 104px; overflow: hidden;">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KE3-B2-EUPC">
<img src="/images/catalog/picture-na_s.jpg" alt="FZ1600R17KE3-B2 - more info" border="0" style="font-size: 11px;"width="100px"><br> </a>
</div>
</td>
<td class="listcell desc">
<div class="proddesc">
<table style="height: 78px;">
<tr><td class="topdesc">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KE3-B2-EUPC">
EUPEC<br>TRANSISTOR </a>
</td></tr>
<tr><td class="middesc">IGBT SINGLE</td></tr>
<tr><td class="botdesc"><span class="bold">ITEM # </span><A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KE3-B2-EUPC">FZ1600R17KE3-B2</A></td></tr> </table>
</div> <!-- proddesc -->
</td>
<td class="listcell" align="center">
<div class="stockstat">NO STOCK</div>
<div class="shipmsg">Est. Lead Time<br>28 days</div>
</td>
<td class="listcell" style="text-align: center; vertical-align: top;">
<div style="padding: 21px 8px 6px 8px; font-weight: bold;">
$2,862.80 </div>
<div class="volmsg">Volume<br>Discounts<br>Available<br></div> </td>
<td class="listcell" align="center">
<!--
<a href="/scripts/cgiip.exe/wa/wcat/shopcart.r?listtype=Catalog&pnum=FZ1600R17KE3-B2-EUPC&mfgr=EUPEC" style="color: #ff0033; font-weight: 700;">ADD to CART</a>
-->
<span class="bold">QTY. </span>
<input type="text" maxlength="8" value="1" class="descr" style="width: 35px; text-align: right;" name="part_4">
<br><input type="image" value="Submit" src="/images/buttons/r-addtocart2.gif"
onClick="return addToCart(part_4.value,this.form,'FZ1600R17KE3-B2-EUPC');"
name="Add to Cart" id="add_4" style="position: relative; top: 4px;">
</td>
</tr>
<tr><!--<td class="listcell"><input type="checkbox" name="comp5"></td>-->
<td class="listcell" style="color: #cccccc; width: 104px; text-align: center;">
<div style="height: 80px; width: 104px; overflow: hidden;">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KE3-EUPC">
<img src="/images/catalog/picture-na_s.jpg" alt="FZ1600R17KE3 - more info" border="0" style="font-size: 11px;"width="100px"><br> </a>
</div>
</td>
<td class="listcell desc">
<div class="proddesc">
<table style="height: 78px;">
<tr><td class="topdesc">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KE3-EUPC">
EUPEC<br>TRANSISTOR </a>
</td></tr>
<tr><td class="middesc">IGBT SINGLE</td></tr>
<tr><td class="botdesc"><span class="bold">ITEM # </span><A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KE3-EUPC">FZ1600R17KE3</A></td></tr> </table>
</div> <!-- proddesc -->
</td>
<td class="listcell" align="center">
<div class="stockstat">NO STOCK</div>
<div class="shipmsg">Est. Lead Time<br>28 days</div>
</td>
<td class="listcell" style="text-align: center; vertical-align: top;">
<div style="padding: 21px 8px 6px 8px; font-weight: bold;">
$2,208.28 </div>
<div class="volmsg">Volume<br>Discounts<br>Available<br></div> </td>
<td class="listcell" align="center">
<!--
<a href="/scripts/cgiip.exe/wa/wcat/shopcart.r?listtype=Catalog&pnum=FZ1600R17KE3-EUPC&mfgr=EUPEC" style="color: #ff0033; font-weight: 700;">ADD to CART</a>
-->
<span class="bold">QTY. </span>
<input type="text" maxlength="8" value="1" class="descr" style="width: 35px; text-align: right;" name="part_5">
<br><input type="image" value="Submit" src="/images/buttons/r-addtocart2.gif"
onClick="return addToCart(part_5.value,this.form,'FZ1600R17KE3-EUPC');"
name="Add to Cart" id="add_5" style="position: relative; top: 4px;">
</td>
</tr>
<tr>
<!--<td class="listcell"><input type="checkbox" name="comp6"></td>-->
<td class="listcell" style="color: #cccccc; width: 104px; text-align: center;">
<div style="height: 80px; width: 104px; overflow: hidden;">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KF6-B2-EUPC">
<img src="/images/catalog/picture-na_s.jpg" alt="FZ1600R17KF6-B2 - more info" border="0" style="font-size: 11px;"width="100px"><br> </a>
</div>
</td>
<td class="listcell desc">
<div class="proddesc">
<table style="height: 78px;">
<tr><td class="topdesc">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KF6-B2-EUPC">
EUPEC<br>TRANSISTOR </a>
</td></tr>
<tr><td class="middesc">IGBT 1600A 1700V SINGLE</td></tr>
<tr><td class="botdesc"><span class="bold">ITEM # </span><A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KF6-B2-EUPC">FZ1600R17KF6-B2</A></td></tr> </table>
</div> <!-- proddesc -->
</td>
<td class="listcell" align="center">
<div class="stockstat">NO STOCK</div>
<div class="shipmsg">Est. Lead Time<br>28 days</div>
</td>
<td class="listcell" align="center" style="color: #cccccc;">•</td>
<td class="listcell" align="center">
<!--
<a href="/scripts/cgiip.exe/wa/wcat/shopcart.r?listtype=Catalog&pnum=FZ1600R17KF6-B2-EUPC&mfgr=EUPEC" style="color: #ff0033; font-weight: 700;">ADD to CART</a>
-->
<span class="bold">QTY. </span>
<input type="text" maxlength="8" value="1" class="descr" style="width: 35px; text-align: right;" name="part_6">
<br><input type="image" value="Submit" src="/images/buttons/r-addtocart2.gif"
onClick="return addToCart(part_6.value,this.form,'FZ1600R17KF6-B2-EUPC');"
name="Add to Cart" id="add_6" style="position: relative; top: 4px;">
</td>
</tr>
<tr>
<!--<td class="listcell"><input type="checkbox" name="comp7"></td>-->
<td class="listcell" style="color: #cccccc; width: 104px; text-align: center;">
<div style="height: 80px; width: 104px; overflow: hidden;">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KF6C-B2-EUPC">
<img src="/images/catalog/picture-na_s.jpg" alt="FZ1600R17KF6C-B2 - more info" border="0" style="font-size: 11px;"width="100px"><br> </a>
</div>
</td>
<td class="listcell desc">
<div class="proddesc">
<table style="height: 78px;">
<tr><td class="topdesc">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KF6C-B2-EUPC">
EUPEC<br>TRANSISTOR </a>
</td></tr>
<tr><td class="middesc">IGBT</td></tr>
<tr><td class="botdesc"><span class="bold">ITEM # </span><A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KF6C-B2-EUPC">FZ1600R17KF6C-B2</A></td></tr> </table>
</div> <!-- proddesc -->
</td>
<td class="listcell" align="center">
<div class="stockstat">NO STOCK</div>
<div class="shipmsg">Est. Lead Time<br>28 days</div>
</td>
<td class="listcell" style="text-align: center; vertical-align: top;">
<div style="padding: 21px 8px 6px 8px; font-weight: bold;">
$2,933.14 </div>
<div class="volmsg">Volume<br>Discounts<br>Available<br></div> </td>
<td class="listcell" align="center">
<!--
<a href="/scripts/cgiip.exe/wa/wcat/shopcart.r?listtype=Catalog&pnum=FZ1600R17KF6C-B2-EUPC&mfgr=EUPEC" style="color: #ff0033; font-weight: 700;">ADD to CART</a>
-->
<span class="bold">QTY. </span>
<input type="text" maxlength="8" value="1" class="descr" style="width: 35px; text-align: right;" name="part_7">
<br><input type="image" value="Submit" src="/images/buttons/r-addtocart2.gif"
onClick="return addToCart(part_7.value,this.form,'FZ1600R17KF6C-B2-EUPC');"
name="Add to Cart" id="add_7" style="position: relative; top: 4px;">
</td>
</tr>
</table>
<table width="98%" border="0"><tr><td width="20%" align="center" class="small"><i>1 - 7 of 7 Matches</i> </td><td width="22%" class="small"> </td><td width="58%" align="right" class="small">
</td> </tr> </table>
</form>
</div>
</td></tr></table>
<div class="gvvtext" style="text-align: right;"><br>
</div>
</div>
</div> <!-- mainbody -->
</div> <!-- core -->
<div class="tfoot">
<!-- tfoot.inc -->
<div class="spacer"></div>
<div class="orangebot">
<div class="footl">
<span id="siteseal"><script type="text/javascript" src="[
seal.godaddy.com];
</div>
<div class="footc">
<div class="small bold">26010 Pinehurst Drive, Madison Heights, MI 48071</div>
<div class="small pad">
<a class="tlnk" href="/fabout.htm">About Us</a> | © Copyright 2009 Galco Industrial Electronics, All Rights Reserved | <a class="tlnk" href="/terms.htm">Terms of Use</a>
</div>
</div>
<div class="footr">
<!-- START SCANALERT CODE -->
<a target="_blank" href="[
www.mcafeesecure.com] width="94" height="54" border="0" src="//images.scanalert.com/meter/www.galco.com/23.gif" alt="McAfee Secure sites help keep you safe from identity theft, credit card fraud, spyware, spam, viruses and online scams" oncontextmenu="alert('Copying Prohibited by Law - McAfee Secure is a Trademark of McAfee, Inc.'); return false;"></a>
<!-- END SCANALERT CODE -->
</div>
</div>
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
var pageTracker = _gat._getTracker("UA-709262-1");
pageTracker._initData();
pageTracker._trackPageview();
</script>
<!-- End of tfoot.inc -->
</div> <!-- tfoot -->
</div> <!-- content -->
</body>
</html>
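For the four fields asked about, one possible approach is to split the page at each `<td class="topdesc">` cell and run a small pattern per field on each chunk. The sketch below is in Python with regular expressions and makes several assumptions. The `SAMPLE` string is a trimmed copy of two records from the listing above. The field names are invented. The manufacturer is assumed to be the text before the `<br>` in the topdesc link. It is unclear whether "quantity" means the stock status or the order-quantity box, so both are captured. Regular expressions like these are fragile and will need adjusting if the page layout changes.

```python
import re

# Trimmed stand-in for the real page; two records, the second has no price.
SAMPLE = '''
<tr><td class="listcell desc"><table>
<tr><td class="topdesc">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KE3-EUPC">
EUPEC<br>TRANSISTOR </a>
</td></tr>
<tr><td class="middesc">IGBT 1600A 1200V SINGLE</td></tr>
<tr><td class="botdesc"><span class="bold">ITEM # </span><A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R12KE3-EUPC">FZ1600R12KE3</A></td></tr> </table>
</td>
<td class="listcell"><div class="stockstat">NO STOCK</div></td>
<td class="listcell"><div style="font-weight: bold;">
$1,859.60 </div></td>
<td class="listcell">
<input type="text" maxlength="8" value="1" class="descr" name="part_1">
</td></tr>
<tr><td class="listcell desc"><table>
<tr><td class="topdesc">
<A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KF6-B2-EUPC">
EUPEC<br>TRANSISTOR </a>
</td></tr>
<tr><td class="middesc">IGBT 1600A 1700V SINGLE</td></tr>
<tr><td class="botdesc"><span class="bold">ITEM # </span><A HREF="/scripts/cgiip.exe/wa/wcat/itemdtl.r?listtype=Catalog&pnum=FZ1600R17KF6-B2-EUPC">FZ1600R17KF6-B2</A></td></tr> </table>
</td>
<td class="listcell"><div class="stockstat">NO STOCK</div></td>
<td class="listcell" align="center" style="color: #cccccc;">•</td>
<td class="listcell">
<input type="text" maxlength="8" value="1" class="descr" name="part_6">
</td></tr>
'''

# One pattern per field; each captures the wanted text in group 1.
FIELDS = {
    "manufacturer": re.compile(r'<a\b[^>]*>\s*([^<]+?)\s*<br>', re.I),
    "part_number": re.compile(
        r'class="botdesc">.*?<a\b[^>]*>\s*([^<]+?)\s*</a>', re.I | re.S),
    "stock": re.compile(r'class="stockstat">\s*([^<]*?)\s*<', re.I),
    "price": re.compile(r'\$\s*([\d,]+\.\d{2})'),
    "quantity": re.compile(
        r'<input\b[^>]*\bvalue="([^"]*)"[^>]*\bname="part_\d+"', re.I),
}


def parse_records(html):
    """Return one dict per record; a field that is not found is set to None."""
    # Text before the first topdesc cell belongs to no record, so skip chunk 0.
    chunks = re.split(r'<td class="topdesc">', html)[1:]
    records = []
    for chunk in chunks:
        record = {}
        for name, pattern in FIELDS.items():
            match = pattern.search(chunk)
            record[name] = match.group(1) if match else None
        records.append(record)
    return records


records = parse_records(SAMPLE)
for record in records:
    print(record)
```

The sixth record in the listing shows a bullet instead of a price, which is why a missing match is stored as `None` rather than treated as an error.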