
c# - Usage of Server Side Controls in MVC Framework

I am using ASP.NET 4.0 with an MVC 2.0 web application. The project requirement is to use server-side controls in the application, which is not possible in the normal case. Ideally I want to use the AdRotator control and the DataList control. I saw a few samples and references in the CodePlex MVC ControlLib, but I found them less useful. Can anyone tell me how to utilize these controls in an ASP.NET application along with MVC? Note: please provide the functionality of the AdRotator and DataList controls themselves, not equivalent functionality. Thanks in advance.

MVC pages do not use the normal .NET page model, which makes the use of normal .NET components impossible. A normal .NET page uses an event-driven model that calls different methods on the server side, while MVC uses actions and views, a completely different way to handle things. Also, MVC does not use ViewState, which the normal .NET controls require. I found an article discussing the mixing of normal .NET and MVC.

c# - Extract content in paragraph Tags

I have the following HTML in a string and have to extract the content in the paragraph tags. Any ideas?

Link: http://www.public-domain-content.com/books/coming_race/c1p1.shtml

I have tried:

    const string HTML_TAG_PATTERN = "<[^>]+.*?>";

    static string StripHtml(string inputString)
    {
        return Regex.Replace(inputString, HTML_TAG_PATTERN, string.Empty);
    }

It removes all the HTML tags, but I don't want to remove every tag, because the tags are the way I can find the content in the paragraph tags.

Secondly, it leaves line breaks (\n) in the text, and applying Replace("\n", "") does not help. One more problem appears when I apply:

    int urlStart = e.Result.IndexOf("<p>"), urlEnd = e.Result.IndexOf("<p>&nbsp;</p></td>\r");
    string paragraph = e.Result.Substring(urlStart, urlEnd);
    extractedContent.Text = paragraph.Replace(Environment.NewLine, "");

<p>&nbsp;</p></td>\r appears at the end of the paragraph, but urlEnd does not make sure that only the paragraph is shown.

The extracted string, as shown in Visual Studio, is below. The page is downloaded with WebClient; this is the end of the HTML page:

we provide ourselves ropes of\rsuitable length , strength- and- pardon me- must not\rdrink more to-night.  our hands , feet must steady and\rfirm tomorrow.\"\r<p>&nbsp;</p>     </td>\r    </tr>\r\r    <tr>\r     <td height=\"25\" width=\"10%\">\r     \r     </td><td height=\"25\" width=\"80%\" align=\"center\">\r       <font color=\"#ffffff\">\r       <font size=\"4\">1</font> &nbsp;\r       </font></td>\r     <td height=\"25\" width=\"10%\" align=\"right\"><a href=\"c2p1.shtml\">next</a></td>\r    </tr>\r   </table>\r  </center>\r</div>\r<p align=\"center\"><a href=\"index.shtml\"><b>the coming race -by- edward bulwer lytton</b></a></p>\r<p><b><center><a href=\"http://www.public-domain-content.com/encyclopedia.shtml\">encyclopedia</a> - <a href=\"http://www.public-domain-content.com/books.shtml\">books</a> - <a href=\"http://www.public-domain-content.com/religion.shtml\">religion<a/> - <a href=\"http://www.public-domain-content.com/links2.shtml\">links</a> - <a href=\"http://www.public-domain-content.com/\">home</a> - <a href=\"http://www.webmaster-headquarters.com/mb/\">message boards</a></b><br>this <a href=\"http://www.wikipedia.org/\">wikipedia</a> content licensed under <a href=\"http://www.gnu.org/copyleft/fdl.html\">gnu fr 
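One thing worth checking in the snippet above: String.Substring takes a start index and a length, not a start index and an end index, so passing urlEnd as the second argument asks for more characters than intended (and throws if that range runs past the end of the string). Below is a minimal sketch of slicing between two markers. The class ParagraphSlice and the method Between are hypothetical names introduced for illustration, not part of the original code, and this only addresses the index arithmetic, not HTML parsing in general.

```csharp
using System;

static class ParagraphSlice
{
    // Hypothetical helper: returns the text from the first startMarker up to
    // (not including) the first endMarker found after it.
    // Returns an empty string when either marker is missing.
    public static string Between(string html, string startMarker, string endMarker)
    {
        int start = html.IndexOf(startMarker, StringComparison.OrdinalIgnoreCase);
        if (start < 0) return string.Empty;

        // Search for the end marker only after the start marker.
        int end = html.IndexOf(endMarker, start, StringComparison.OrdinalIgnoreCase);
        if (end < 0) return string.Empty;

        // Substring expects (startIndex, length), so convert the end index to a length.
        return html.Substring(start, end - start);
    }
}
```

With the variables from the question, the call would look something like ParagraphSlice.Between(e.Result, "<p>", "<p>&nbsp;</p></td>").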

Don't use regular expressions to parse HTML. Use the Html Agility Pack (or something similar) instead.

As a quick example, something like this:

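    HtmlDocument document = new HtmlDocument();
    document.Load("your_file_here.htm");

    // DocumentNode is the root node in Html Agility Pack.
    // SelectNodes returns null when nothing matches, so check before looping.
    var paragraphs = document.DocumentNode.SelectNodes("//p");
    if (paragraphs != null)
    {
        foreach (HtmlNode paragraph in paragraphs)
        {
            // paragraph node here
            string content = paragraph.InnerText; // or similar
        }
    }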
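The text you get back this way will still contain the page's line breaks. Judging by the dump in the question, those are bare \r characters, which would explain why Replace("\n", "") and Replace(Environment.NewLine, "") leave them in place: neither "\n" nor "\r\n" matches a lone "\r". A minimal sketch of one way to clean the extracted text follows. TextCleanup and CollapseWhitespace are hypothetical names, and the regular expression here runs on the already extracted plain text, not on the HTML.

```csharp
using System;
using System.Text.RegularExpressions;

static class TextCleanup
{
    // Hypothetical helper: replaces every run of whitespace (spaces, tabs,
    // \r, \n, or \r\n) with a single space and trims both ends.
    public static string CollapseWhitespace(string text)
    {
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

It could then be applied to each paragraph, for example TextCleanup.CollapseWhitespace(paragraph.InnerText).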
