Category: Web Technology


We frequently need to grab data from a remote site, and there are several ways to do that. The one I like best is XML Path (XPath). XPath is used to navigate through the elements and attributes of an XML document. An XPath query returns a set of nodes to the calling method or application, where a node is a complete element within the XML document.

Here is a process by which you can grab data from an HTML/XHTML source using XPath and cURL.

The basic XPath class:
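To make the idea of a node result set concrete, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath classes. The XML string and the variable names are made up purely for illustration.

```php
<?php
// A tiny, made-up XML document used only for this illustration.
$doc = new DOMDocument();
$doc->loadXML('<books><book id="b1"><title>PHP Basics</title></book><book id="b2"><title>XPath Guide</title></book></books>');

$xpath = new DOMXPath($doc);

// Select element nodes: every <title> that sits inside a <book>.
$titles = array();
foreach($xpath->query('//book/title') as $node){
	$titles[] = $node->textContent;
}

// Select attribute nodes: the id attribute of every <book>.
$ids = array();
foreach($xpath->query('//book/@id') as $attr){
	$ids[] = $attr->nodeValue;
}
```

After this runs, $titles holds the text of both title elements and $ids holds the two id values, in document order.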

Class xpath{
	public $html;
	public $patten;
	public $childnodes = 0;
	public $attribute = 0;
	public $forbidden = 0;
	public $search = 0;
	public $br2nl = 1;
	public $return = "string";
	public $charset = "utf8";
	private $returnd;
	
	function __construct($html, $patten){
		$this->html = $html; 
		$this->patten = $patten;
	}
	public function execute(){
		$xpath = new DOMXPath($this->html); 
		$basenodes = $xpath->query($this->patten);
		foreach ($basenodes as $basenode){
			if($this->childnodes){
				foreach($basenode->childNodes as $childnode){
					$this->buffer($childnode);
				}
			}else{
				$this->buffer($basenode);
			}
		}
	}
	private function buffer($value){
		$preparedvalue = $this->prepare($value);
		if($preparedvalue){
			if($this->returnstring()){ 
				$this->returnd .= $preparedvalue;
			}else{
				$this->returnd[] = $preparedvalue;
			}
		}
	}
	private function returnstring(){
		if($this->return == "string"){
			return true;
		}else{
			return false;
		}
	}
	private function returnutf8(){
		if($this->charset == "utf8"){
			return true;
		}else{
			return false;
		}
	}
	private function prepare($before){
		// Only element nodes have a tag name and attributes; text nodes do not.
		$iselement = ($before instanceof DOMElement);
		if($iselement AND $before->tagName == "br" AND $this->br2nl){
			$before = "\n";
		}elseif($this->search AND !in_array($before->textContent, $this->search)){
			// A search list is set and this node's text is not in it: skip the node.
			$before = 0;
		}elseif($this->attribute){
			$before = $iselement ? $before->getAttribute($this->attribute) : 0;
		}else{
			$before = trim($before->textContent);
		}
		if(!$this->returnutf8()){
			// Convert UTF-8 to ISO-8859-1. Note: utf8_decode() is deprecated as of PHP 8.2.
			$before = utf8_decode($before);
		}
		if($this->forbidden){
			if(in_array($before, $this->forbidden)){
				return false;
			}else{
				return $before;
			}
		}else{
			return $before;
		}
	}
	public function get(){
		return $this->returnd;
	}
}

We have to create an xpath object to grab data, using HTML ids/classes as nodes. First, we need to grab the page source (for example with PHP cURL). Once you have that source, you can easily filter the data using attributes/nodes.

Here is an example of fetching data by node and of reading HTML attributes.

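	$grab_path = trim("Your URL goes here");
				
	// Decode HTML-escaped ampersands so the query string stays intact.
	$grab_path = str_replace("&amp;","&",$grab_path);
	$url = $grab_path; // do not urlencode() a full URL; that would also escape "://" and "/"
	$html = new DOMDocument(); 
	libxml_use_internal_errors(true); // keep warnings about malformed HTML quiet
	$html->loadHTMLFile($url); 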
	
    $description = new xpath($html, '//div[@class="body_container"]'); // div class name
    $description->return = "string";
    $description->childnodes = 1;
    $description->forbidden = array("Body");
    $description->execute();

    $title = new xpath($html, '//h1'); //Grab h1 data 
    $title->execute();

    $image = new xpath($html, '//*[@id="main_image"]'); //Grab Images
    $image->attribute = "src";
    $image->execute();

    $adparams = new xpath($html, '//table[@class="params UnderlinedLinks"]/tr[1]/td[1]'); //fetch tables
    $adparams->childnodes = 1;
    $adparams->return = "array";
    $adparams->execute();

   $info[] = array(
		'title_name' => $title->get(), 
		'image' => $image->get(), 
	);
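If you would rather fetch the page source with cURL first and then hand it to DOMDocument, a minimal sketch could look like the following. The helper names fetch_source and load_dom are hypothetical (they are not part of the class above), and the 15 second timeout is just an assumed value.

```php
<?php
// Hypothetical helper: download the raw page source with cURL.
function fetch_source($url){
	$ch = curl_init($url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
	curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // assumed timeout in seconds
	$source = curl_exec($ch);
	return $source; // false when the request failed
}

// Hypothetical helper: turn an HTML string into a DOMDocument.
function load_dom($source){
	$html = new DOMDocument();
	libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
	$html->loadHTML($source);
	libxml_clear_errors();
	return $html;
}

// Usage: $html = load_dom(fetch_source("Your URL goes here"));
```

The resulting $html object can be passed to the xpath class in the same way as a document loaded directly from a URL. Remember to check for a false return value from fetch_source before parsing.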

It's a very effective way to retrieve data from an HTML source, and it's easy to adapt when the site's CSS and HTML change.


Is Facebook moving Faster?

Facebook is moving really fast, and I feel that within a very short time it will be the most heavily trafficked site on the web.

Look at the compete.com progress graphs for the last year, comparing the three most heavily trafficked sites on the web (Google, Yahoo, Facebook). Can you see how fast Facebook is moving?

By Unique Visits:

[Image: www-facebook-com-www-google-co_uv_1y]

By Total visits:

[Image: www-facebook-com-www-google-co_sess_1y]

Facebook has already taken 2nd position, pushing Yahoo into 3rd, and it is threatening to take the top spot from Google. The graph suggests that moment is not very far away.

How Does SSL Work?

SSL technology relies on the concept of public key cryptography to accomplish its tasks. In conventional (symmetric) encryption, the two communicating parties share a single password or key, which is used both to encrypt and to decrypt messages. While this is a very simple and efficient method, it doesn't solve the problem of giving the password to someone you have not yet met or do not yet trust.

In public key cryptography, each party has two keys, a public key and a private key. Information encrypted with a person's public key can only be decrypted with the matching private key, and vice versa. Each user tells the world what his public key is but keeps his private key to himself.
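As a rough illustration of that idea, the sketch below generates an RSA key pair and does one encrypt/decrypt round trip. It assumes PHP's openssl extension is installed and able to find its configuration; the message text and variable names are made up.

```php
<?php
// Generate a 2048-bit RSA key pair (requires PHP's openssl extension).
$keypair = openssl_pkey_new(array(
	"private_key_bits" => 2048,
	"private_key_type" => OPENSSL_KEYTYPE_RSA,
));

// Extract the public half as PEM text; this is the part you can publish.
$details   = openssl_pkey_get_details($keypair);
$publickey = $details["key"];

// Anyone who has the public key can encrypt a message...
$message = "hello xyz.com";
openssl_public_encrypt($message, $encrypted, $publickey);

// ...but only the holder of the private key can decrypt it.
openssl_private_decrypt($encrypted, $decrypted, $keypair);
```

RSA can only encrypt short messages this way, which is one reason SSL uses public keys to agree on a session key rather than to encrypt the whole conversation.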

The SSL handshake protocol determines how the server and client negotiate which cipher suites they will use to authenticate each other, to transmit certificates, and to establish session keys.

  • SSL is built on public key cryptography.
  • In SSL there are three (3) steps of encryption, each with its own algorithms:
      Key Exchange: RSA, Diffie-Hellman, DSA
      Cipher Encryption: AES, DES, RC4
      Hashing: MD5, SHA

How SSL Works

I. Obtaining an SSL Certificate

XYZ Inc. intends to secure the customer checkout process, account management, and internal employee correspondence on its website, xyz.com.

Step 1: XYZ creates a Certificate Signing Request (CSR) and during this process, a private key is generated.

Step 2: XYZ goes to a trusted, third-party Certificate Authority. The Certificate Authority takes the certificate signing request and validates XYZ in a two-step process: it checks that XYZ has control of the domain xyz.com and that XYZ Inc. is an official organization listed in public government records.

Step 3: When the validation process is complete, the Certificate Authority gives XYZ a certificate containing XYZ's public key, signed with the Certificate Authority's private key.

Step 4: XYZ installs the certificate on their webserver(s).
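Step 1 above can be sketched with PHP's openssl extension. This is only an illustration under assumptions: the distinguished name values (including the country code) are placeholders for XYZ Inc., and a real Certificate Authority may ask for more fields.

```php
<?php
// Distinguished name for the request; all values are placeholders.
$dn = array(
	"countryName"      => "US",
	"organizationName" => "XYZ Inc.",
	"commonName"       => "xyz.com",
);

// The private key is generated as part of preparing the request.
$privkey = openssl_pkey_new(array(
	"private_key_bits" => 2048,
	"private_key_type" => OPENSSL_KEYTYPE_RSA,
));

// Build the Certificate Signing Request and export both parts as PEM text.
$csr = openssl_csr_new($dn, $privkey, array("digest_alg" => "sha256"));
openssl_csr_export($csr, $csrpem);      // $csrpem is what gets sent to the Certificate Authority
openssl_pkey_export($privkey, $keypem); // $keypem stays on the server and is never shared
```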

II. How Customers Communicate with the Server using SSL

[Image: ssl-handshake]

Step 1: A customer makes a connection to xyz.com on an SSL port, typically 443. This connection is denoted with https instead of http.

Step 2: xyz.com sends back its public key (certificate) to the customer. Once the customer receives it, his/her browser decides whether it is all right to proceed.

  • The xyz.com certificate must NOT be expired
  • The xyz.com certificate must be for xyz.com only
  • The client must have the Certificate Authority's public key installed in the browser's certificate store. If the customer has the Certificate Authority's trusted public key, then they can trust that they are really communicating with XYZ Inc.

Step 3: If the customer decides to trust the certificate, the customer's browser sends his/her own public key to xyz.com.

Step 4: xyz.com will next create a unique hash and encrypt it using both the customer’s public key and xyz.com’s private key, and send this back to the client.

Step 5: The customer's browser decrypts the hash. This shows that xyz.com sent the hash and that only the customer is able to read it.

Step 6: Customer and website can now securely exchange information.
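The first two browser checks from Step 2 can be sketched as a small function. It works on an array in the shape returned by PHP's openssl_x509_parse(); the function name certificate_looks_valid is hypothetical, and the sketch deliberately leaves out the third check (verifying the Certificate Authority's signature against the browser's certificate store).

```php
<?php
// Hypothetical helper: apply the expiry and host name checks to a certificate
// array in the shape returned by openssl_x509_parse().
function certificate_looks_valid($cert, $hostname, $now){
	// The certificate must not be expired (and must already be valid).
	if($now < $cert["validFrom_time_t"] OR $now > $cert["validTo_time_t"]){
		return false;
	}
	// The certificate must be issued for this host name.
	// Simplification: only the subject CN is compared; real browsers also
	// look at subjectAltName entries and wildcard names.
	return strcasecmp($cert["subject"]["CN"], $hostname) == 0;
}
```

In practice you would call it with the parsed certificate of the site, the host name the customer typed, and time() as the current timestamp.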

SEO Tools

Nowadays, we can’t think of a website without Search Engine Optimization. SEO is one of the most important marketing strategies for promoting a site.

http://www.searchenginegenie.com/

This site is pretty nice and cool. 🙂

The following link shows some nice SEO tool widgets:

http://www.searchenginegenie.com/seo-tools.htm#Widget

Google Wave is a new web technology introduced by Google Inc. It is a powerful and exciting web application that should be more user-friendly and make life easier for us. Hopefully Google will release it by the end of this year.

Watch this video. You will understand what it is.

Google Website Optimizer is a free A/B testing and multivariate testing application that helps online marketers and webmasters increase visitor conversion rates and overall visitor satisfaction by continually testing different combinations of website content. Google website optimizer can test any element that exists as HTML code on a page including calls to action, fonts, headlines, point of action assurances, product copy, product images, product reviews, and forms.

There are two types of website optimization technique:

1. A/B testing

2. Multivariate Testing.

A/B Testing:

An A/B experiment allows you to test the performance of two (or more!) entirely different versions of a page. Start with your original test page — the page whose content you want to test — then create alternate versions of that page. You can change the content of a page, alter the look and feel, or move around the layout of your alternate pages — whatever you choose. We’ll vary traffic to your original page and your alternate versions, to see what users respond to best.
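To give a feel for how traffic can be split between an original page and its alternates, here is a small sketch. It is not how Google Website Optimizer works internally; the function name pick_variant, the visitor id format, and the page names are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: pick a page variant for a visitor.
// Hashing the visitor id means the same visitor always sees the same variant.
function pick_variant($visitorid, $variants){
	// Mask to a non-negative number so the result is a valid array index on any platform.
	$index = (crc32($visitorid) & 0x7fffffff) % count($variants);
	return $variants[$index];
}

// Usage: the original page plus two alternate versions.
$pages = array("original.html", "alternate-a.html", "alternate-b.html");
$page  = pick_variant("visitor-12345", $pages);
```

Keeping the assignment stable per visitor matters because a visitor who sees a different version on every page load would blur the comparison between versions.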

Multivariate Test:

The Motivity platform includes a Google Website Optimizer module that allows you to test your website content and ecommerce pages. In order to test pages you normally need to insert snippets of JavaScript code provided by Google into your pages which most content management and ecommerce systems currently don’t support. But this CMS platform does this automatically on every page and on many of the standard content areas of your site. In addition, you can use the button in any of the WYSIWYG site editors to add the Google Website Optimizer tags in order to make additional sections of the page testable. Another great feature is that you can manage all of your A/B and Multivariate tests from a centralized location and see which pages are being tested in each of your tests.