<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<generator>NFE/5.0</generator>
<title>"Apple" - Google News</title>
<link>
https://news.google.com/search?q=apple
</link>
<language>en-IN</language>
<webMaster>news-webmaster@google.com</webMaster>
<copyright>2019 Google Inc.</copyright>
<lastBuildDate>xxxxxxxxxxxxx</lastBuildDate>
<description>Google News</description>
<item>
<title>
HomePod available in China starting Friday, January 18 - Apple Newsroom
</title>
<link>
https://www.apple.com/newsroom/2019/01/HomePod-available-in-china-starting-friday-january-18/
</link>
<pubDate>Sun, 13 Jan 2019 21:57:48 GMT</pubDate>
<description>
<a href="https://www.apple.com/newsroom/2019/01/HomePod-available-in-china-starting-friday-january-18/" target="_blank">HomePod available in China starting Friday, January 18</a> <font color="#6f6f6f">Apple Newsroom</font><p>HomePod, the innovative wireless speaker from Apple, will be available in mainland China and Hong Kong markets starting Friday, January 18.</p>
</description>
<source url="https://www.apple.com">Apple Newsroom</source>
<media:content url="https://lh5.googleusercontent.com/proxy/3sb76nYiUcZoYNn3vMBTrH0dbNTM0r73U5lBdJdHlU10Y1o-8HfGmUBJhogpIrdmr4YybfRtSHUb7pdrbrIHmnT48bn-KzHiuNpha_GnkjyokluuT0WMbxZSn5oNO_Znmz550OL4XZAuEzfRx_Ai3KR11avjFAf9sNM6eLccqsXxMrniTtF4zvtcfso2n6MGO7pzbWM=-w150-h150-c" medium="image" width="150" height="150"/>
</item>
</channel>
</rss>
There are two ways to parse this XML. Before listing them, let's look at the different types of XML nodes and how to extract data from each of them; that will also explain why I've listed two methods. Here are the node types to watch for:
<item-name>
The tag name contains a hyphen, so in this case you have to use the brace syntax: $news->{'item-name'}
<item:name>
The tag name carries a namespace prefix. Data from such tags can only be retrieved if you know the namespace of the XML.
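What is an XML namespace? It is a prefix bound to a URI through an xmlns attribute. In the sample above, the media prefix is declared on the rss tag as xmlns:media="http://search.yahoo.com/mrss/".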
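To make the two node types concrete, here is a minimal sketch using SimpleXML on a tiny inline document. The feed, element names and example.com URL are made up for illustration; only the media namespace URI comes from the sample above.

```php
<?php
// Hypothetical sample document; element names are illustrative only.
$xml = <<<XML
<feed xmlns:media="http://search.yahoo.com/mrss/">
  <item-name>Hyphenated element</item-name>
  <media:content url="https://example.com/image.jpg" medium="image"/>
</feed>
XML;

$feed = simplexml_load_string($xml);

// A hyphen is not valid in a plain PHP property name, so use the brace syntax.
$hyphenated = (string) $feed->{'item-name'};

// Prefixed elements are reached through their namespace URI.
$media = $feed->children('http://search.yahoo.com/mrss/');

// The url attribute has no prefix, so read it through attributes().
$imageUrl = (string) $media->content->attributes()->url;

echo $hyphenated, PHP_EOL;
echo $imageUrl, PHP_EOL;
```

The casts to string matter: without them you get SimpleXMLElement objects rather than plain values.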
Here are the two methods for fetching Google News, which you can then show on your website or simply save in your database:
file_get_contents, then manipulating the XML string (Recommended)
public function getNewsFromGoogle($query)
{
    $newsXml = file_get_contents('https://news.google.com/rss/search?q=' . urlencode($query));
    // Convert prefixed tags such as <media:content> to <mediacontent>
    $newsXml = preg_replace('/(<\/?)(\w+):([^>]*>)/', '$1$2$3', $newsXml);
    $newsXml = simplexml_load_string($newsXml);

    $news = [];
    foreach ($newsXml->channel->item as $item) {
        $title = (string) $item->title;
        // Skip the deprecation notice item
        if ($title == "This RSS feed URL is deprecated") {
            continue;
        }

        $details = [];
        $details['title'] = trim($title);
        $details['description'] = trim(strip_tags((string) $item->description));
        $details['published_date'] = date('Y-m-d H:i:s', strtotime((string) $item->pubDate));
        // Cast to string; otherwise a SimpleXMLElement object is stored
        $details['url'] = trim((string) $item->link);
        $details['image'] = isset($item->mediacontent) ? (string) $item->mediacontent['url'] : null;
        $news[] = $details;
    }

    return $news;
}
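To see what the preg_replace step does without hitting the network, here is a small sketch that runs the same pattern on a hypothetical one-item feed (the title and example.com URL are invented for the demo).

```php
<?php
// Hypothetical one-item feed, trimmed down from the sample above.
$raw = '<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0"><channel><item>'
     . '<title>Example</title>'
     . '<media:content url="https://example.com/a.jpg" medium="image"/>'
     . '</item></channel></rss>';

// Drop the colon from tag names: <media:content> becomes <mediacontent>.
// The xmlns:media declaration is untouched because "rss" is followed by a space, not a colon.
$flat = preg_replace('/(<\/?)(\w+):([^>]*>)/', '$1$2$3', $raw);

$item = simplexml_load_string($flat)->channel->item;

// After flattening, the element is an ordinary child with ordinary attributes.
$image = isset($item->mediacontent) ? (string) $item->mediacontent['url'] : null;

echo $image, PHP_EOL;
```

Keep in mind that the pattern works on the raw string, so any text in the feed that happens to look like a prefixed tag gets rewritten too.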
simplexml_load_file
public function getNewsFromGoogle($query)
{
    $newsXml = simplexml_load_file('https://news.google.com/rss/search?q=' . urlencode($query));
    // Needed only when the feed has prefixed tags such as <media:content>
    $ns = $newsXml->getNamespaces(true);

    $news = [];
    foreach ($newsXml->channel->item as $item) {
        $title = (string) $item->title;
        // Skip the deprecation notice item
        if ($title == "This RSS feed URL is deprecated") {
            continue;
        }

        // "media" is the namespace prefix; refer to the XML sample above
        $media = $item->children($ns['media']);

        $details = [];
        $details['title'] = trim($title);
        $details['description'] = trim(strip_tags((string) $item->description));
        $details['published_date'] = date('Y-m-d H:i:s', strtotime((string) $item->pubDate));
        // Cast to string; otherwise a SimpleXMLElement object is stored
        $details['url'] = trim((string) $item->link);
        // Read the unprefixed url attribute through attributes()
        $details['image'] = isset($media->content) ? (string) $media->content->attributes()->url : null;
        $news[] = $details;
    }

    return $news;
}
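Both methods assume the feed always loads and parses. simplexml_load_string and simplexml_load_file return false on failure, so a guard is worth having. Here is a minimal sketch; parseFeedTitles is a hypothetical helper name, and it only collects titles to keep the example short.

```php
<?php
/**
 * Hypothetical helper: parses a feed string and returns the item titles,
 * or an empty array when the string is not a usable feed.
 */
function parseFeedTitles(string $xmlString): array
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_string($xmlString);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // false means the XML was not well-formed; the isset guards a feed without items.
    if ($feed === false || !isset($feed->channel->item)) {
        return [];
    }

    $titles = [];
    foreach ($feed->channel->item as $item) {
        $titles[] = trim((string) $item->title);
    }

    return $titles;
}

$sample = '<rss><channel><item><title> First </title></item>'
        . '<item><title>Second</title></item></channel></rss>';

print_r(parseFeedTitles($sample));
print_r(parseFeedTitles('<rss><channel>'));
```

Splitting the parsing from the download like this also lets you test it with a saved copy of the feed.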
Enjoy
apt-get update
apt-get install python-software-properties
apt-get install software-properties-common
apt-get update
apt-get install zip unzip
add-apt-repository ppa:ondrej/php
If you get an error here about Python Unicode, the problem is the locale set on your machine. To fix it, generate and export a UTF-8 locale with the following commands, then run the add-apt-repository command again:
locale-gen en_US.UTF-8
export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8
apt-get update
apt-get install php7.2
apt-get install php7.2-curl php7.2-mysql
apt-get install libapache2-mod-php7.2 php7.2-mbstring php7.2-xml php7.2-opcache php7.2-gd php7.2-zip
php -v
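php -v only confirms the PHP version. To check that the extensions were picked up as well, something like the sketch below may help. missingExtensions is a hypothetical helper, and the extension names in $required are my assumption of what the packages above provide (opcache is left out because it registers under a different name).

```php
<?php
/**
 * Hypothetical helper: returns the names from $required that are not loaded.
 */
function missingExtensions(array $required): array
{
    return array_values(array_filter(
        $required,
        function (string $name): bool {
            return !extension_loaded($name);
        }
    ));
}

// Assumed extension names for the packages installed above.
$required = ['curl', 'mysqli', 'mbstring', 'simplexml', 'gd', 'zip'];
$missing = missingExtensions($required);

echo $missing ? 'Missing: ' . implode(', ', $missing) : 'All extensions loaded', PHP_EOL;
```

Run it from the command line with php, and also through Apache if you can, since the two can load different configuration files.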