How Google Search Engine Works
Let’s start with how Google search works. The first thing to understand is that when you do a Google search, you aren’t searching the web; you’re searching Google’s index of the web, or at least as much of it as Google can find. Google builds this index with software programs called spiders. Spiders start by fetching a few web pages, then follow the links on those pages and fetch the pages they point to, and so on, until Google has indexed a pretty big chunk of the web: many trillions of pages stored across thousands of servers.
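To make the "fetch a page, follow its links, repeat" idea concrete, here is a minimal sketch of a breadth-first crawl. It is not Google's crawler: the `TOY_WEB` dictionary, the `crawl` function, and the `.example` URLs are all made up for illustration, and the "web" lives in memory so nothing is actually downloaded.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
TOY_WEB = {
    "a.example/": ["a.example/cats", "b.example/"],
    "a.example/cats": ["b.example/cheetah"],
    "b.example/": ["b.example/cheetah", "a.example/"],
    "b.example/cheetah": [],
    "c.example/orphan": ["a.example/"],  # no page links here
}


def crawl(seeds, fetch_links, max_pages=1000):
    """Breadth-first crawl: visit the seeds, then the pages they link to, and so on."""
    seen = set(seeds)
    queue = deque(seeds)
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited


if __name__ == "__main__":
    print(crawl(["a.example/"], lambda url: TOY_WEB.get(url, [])))
```

Notice that `c.example/orphan` never shows up in the result: no crawled page links to it, which is exactly why a spider can only index as much of the web as it can find.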
Now, suppose I want to know how fast a cheetah can run. I type in my search, say, cheetah running speed, and hit Enter. Google’s software searches its index to find every page that includes those search terms. In this case, there are hundreds of thousands of possible results.
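Searching an index instead of the pages themselves can be sketched with a tiny inverted index, which maps each word to the pages that contain it. This is only a toy under simple assumptions (whitespace tokenizing, every query word must match); the `PAGES` data and the function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical page contents, keyed by a made-up page ID.
PAGES = {
    "p1": "cheetah running speed facts",
    "p2": "running shoes for speed training",
    "p3": "the cheetah is the fastest land animal",
}


def build_index(pages):
    """Map each lowercase word to the set of page IDs that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the IDs of pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())  # keep only pages that also have this word
    return results


if __name__ == "__main__":
    index = build_index(PAGES)
    print(search(index, "cheetah running speed"))
```

The lookup touches only the index entries for the query words, not the text of every page, which is what makes this approach fast at scale.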
How does Google decide which few documents I want? By asking questions, more than 200 of them. For example: how many times does this page contain your keywords? Do the words appear in the title, in the URL, directly adjacent to each other? Does the page include synonyms for those words? Is this page from a quality website, or is it low quality, even spammy? What is this page’s PageRank? That’s a formula invented by Google’s founders Larry Page and Sergey Brin that rates a web page’s importance by looking at how many outside links point to it, and how relevant those links are.
Finally, Google combines all those factors to produce each page’s overall score and sends back your search results about half a second after you submit your search.
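The PageRank idea mentioned above can be sketched in a few lines. This is the simplified textbook formulation, not Google's production ranking: the damping factor of 0.85, the fixed number of iterations, and the `LINKS` graph are assumptions chosen for illustration. Each page repeatedly passes a share of its score to the pages it links to, so a page linked from important pages ends up important itself.

```python
# Hypothetical link graph: page -> pages it links to.
# Every linked page must also appear as a key.
LINKS = {
    "home": ["about", "cheetah"],
    "about": ["cheetah"],
    "blog": ["cheetah", "home"],
    "cheetah": ["home"],
}


def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by repeated redistribution of rank."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page without outgoing links spreads its rank over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    for page, score in sorted(pagerank(LINKS).items(), key=lambda item: -item[1]):
        print(f"{page}: {score:.3f}")
```

In this toy graph nothing links to `blog`, so it ends up with the lowest score, while `cheetah` and `home`, which receive links from several pages, score highest.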
At Google, each entry includes a title, a URL, and a snippet of text to help me decide whether this page is what I’m looking for. I also see links to similar pages, Google’s most recent stored version of that page, and related searches that I might want to try next. And sometimes, along the right and at the top, I’ll see ads. Google takes its advertising business very seriously as well, both its commitment to deliver the best possible audience for advertisers and its effort to show only ads that you want to see.
Google is very careful to distinguish ads from regular search results, and it won’t show you any ads at all if it can’t find any that it thinks will help you find the information you’re looking for, which in this case is the cheetah’s top running speed: more than 60 miles an hour. So that is a brief explanation of how the Google search engine works.
There are also some basic search techniques that we can use in the Google search box to refine our results and get exactly what we need from a search.
Three Typical Types Of Users On Google Search Engine
As the headline says, there are three typical types of Google users:
- Mr. Searcher
- Mr. Developer
- Mr. Hacker
So let’s start with Mr. Searcher. Mr. Searcher is an ordinary Google user. He only needs typical results from a Google search, such as the best upcoming movies of 2020 or the best actor of 2019, or he wants to open social media sites or search for images, quotes, and so on. For Mr. Searcher, I have some tricks for getting a specific result as fast as possible.
Example 1: site:facebook.com This query restricts the results to pages on Facebook.
Example 2: site:facebook.com inurl:lamborghini This query shows Facebook results whose URL contains lamborghini.
Example 3: If you use a social network such as Twitter, you can search by hashtag, or if you own a business, try this query
site:twitter.com #kfc to get public feedback or discussions based on a specified hashtag.
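If you want to build such queries in a script, for example to turn them into links, the sketch below shows one way to do it. The helper names `build_query` and `search_url` are made up for this example; the only real API used is `urllib.parse.urlencode` from the Python standard library. Encoding matters here because characters such as `#` and `:` have a special meaning inside a URL.

```python
from urllib.parse import urlencode


def build_query(terms="", **operators):
    """Join operator:value pairs (e.g. site:facebook.com) with free-text terms."""
    parts = [f"{name}:{value}" for name, value in operators.items()]
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL with the query percent-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(build_query(site="facebook.com", inurl="lamborghini"))
    print(search_url(build_query("#kfc", site="twitter.com")))
```

The first call produces the plain query text you would type into the search box; the second produces a link in which the space, the colon, and the `#` are safely encoded.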
Example 4: To find similar sites and topics, use the query
related:titanic movie or
If you want to know how Mr. Developer and Mr. Hacker use Google search, continue reading.
So it’s time for Mr. Developer. Developers run many kinds of queries: about an error that occurred in a program, about problems installing, uninstalling, or using software, or to get in touch with another specific developer.
Example 1: Suppose Mr. Developer encounters an error in his program. To solve it, he can copy the error message, paste it inside double quotes in the Google search box, and hit Enter. The sample query can be
Example 2: Mr. Developer has problems installing, uninstalling, or using software, because some download pages never provide a manual. A simple query can find it:
filetype:pdf "name of the software"
Example 3: Many users don’t realize it, but Google has a dedicated interface for advanced image search, where you can upload an image or enter the URL of the image you want to search for.
Note: There are some unexpected mistakes that developers should know about. Reading up on them is highly recommended if you are a developer or plan to start developing in the future.
For Mr. Hacker, Google search is a god-level tool. Hackers use it to find hidden files on Google-indexed pages, vulnerable systems, and archived web pages, to perform passive reconnaissance, and much more. So let’s start.
Take a look at Table 1 (Google search query operators). The right query can yield some quite startling results. Let’s start with something simple. Suppose that a vulnerability is discovered in a popular application, let’s say Microsoft IIS server version 5.0, and a hypothetical attacker decides to find a few computers running this software in order to attack them. He could use a scanner of some kind (such as nmap), but he prefers Google, so he enters the query
"Microsoft-IIS/5.0 Server at" intitle:index.of and receives links to the servers he needs (or, more specifically, links to autogenerated directory listings for those servers).
This works because, in its default configuration, IIS (just like many other server applications) adds banners containing its name and version to some dynamically generated pages.
It’s a typical example of information that seems quite harmless in itself, so it is generally ignored and left in the default configuration. Unfortunately, it is also information that, in certain circumstances, can be most valuable to a potential attacker.
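To check what your own pages give away, you can scan their text (or a response header value) for such banners. The sketch below is a simple assumption-based check, not a complete scanner: the regular expression only looks for tokens shaped like `Product/1.2.3`, and the name `find_banners` is made up for this example.

```python
import re

# Matches "Product/version" tokens such as "Microsoft-IIS/5.0" or "Apache/2.4.57".
BANNER_RE = re.compile(r"([A-Za-z][\w.-]*)/(\d+(?:\.\d+)*)")


def find_banners(text):
    """Return (product, version) pairs that look like server banners in the text."""
    return BANNER_RE.findall(text)


if __name__ == "__main__":
    sample = "Microsoft-IIS/5.0 Server at www.example.com Port 80"
    for product, version in find_banners(sample):
        print(f"discloses {product} version {version}")
```

If this finds something on a page your server generates, an attacker can find it through a search engine just as easily, so it is worth checking whether the banner can be switched off in the server configuration.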
Table 1: Google search query operators
|Operator|Description|Sample use|
|site:|Restricts results to sites within the specified domain|… word orvill, located within the *.teccon.co.in domain|
|intitle:|Restricts results to documents whose title contains the specified keyword|… title and amazon in the text|
|allintitle:|Restricts results to documents whose title contains all the specified keywords||
|inurl:|Restricts results to sites whose URL contains the specified keyword||
|allinurl:|Restricts results to sites whose URL contains all the specified keywords||
|filetype:|Restricts results to documents of the specified type|… with the word author|
|numrange:|Restricts results to documents containing a number from the specified range||
|link:|Restricts results to sites containing links to the specified location||
|allintext:|Restricts results to documents containing the specified keyword in the text, but not in the title, link descriptions or URLs||
|+|Specifies that a keyword should occur frequently in results|… the word amazon|
|-|Specifies that a keyword must not occur in search results||
|""|Delimiters for entire search phrases (not single words)||
|.|Wildcard for a single character||
|*|Wildcard for a single word||
Table 2 contains sample queries that look for typical error messages. The only way to prevent our systems from publicly revealing error information is to remove all bugs as soon as we can and (if possible) to configure applications to log errors to files instead of displaying them to users. Remember that even if you react quickly (and thus make the error pages indexed by Google out of date), a potential intruder will still be able to examine the version of the page cached by Google by simply clicking the link to the page copy. Fortunately, the sheer volume of web resources means that pages can only be cached for a relatively short time.
Table 2: Error message queries
|Informix database errors, potentially containing function names, filenames, file structure information, portions of SQL code and passwords|
|authorization errors, potentially containing user names, function names, file structure information and pieces of SQL code|
|access-related PHP errors, potentially carrying filenames, function names and file structure information|
|Oracle database errors, potentially carrying filenames, function names and file structure information|
|Cocoon errors, potentially carrying Cocoon version information, filenames, function names, and file structure information|
|Invision Power Board bulletin board errors, potentially carrying function names, filenames, file structure information and pieces of SQL code|
|MySQL database errors, potentially carrying user names, function names, filenames, and file structure information|
|CGI script errors, potentially containing information about the operating system and program versions, user names, filenames, and file structure information|
|MySQL database errors, potentially containing information about database structure and contents|
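The advice to log errors to files instead of showing them to users can be sketched with Python's standard `logging` module. The function names, the logger name `app.errors`, and the generic message are assumptions made for this example; a real application would wire this into its web framework's error handling.

```python
import logging
import os
import tempfile


def make_error_logger(log_path):
    """Create a logger that writes error details to a file only."""
    logger = logging.getLogger("app.errors")
    logger.setLevel(logging.ERROR)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep details away from console/root handlers
    return logger


def handle_request(action, logger):
    """Run an action; on failure, log the details and return a generic message."""
    try:
        return action()
    except Exception:
        # The full traceback goes to the log file, never to the visitor.
        logger.exception("request failed")
        return "Something went wrong. Please try again later."


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "errors.log")
    log = make_error_logger(path)

    def failing_action():
        raise RuntimeError("Access denied for user 'root'@'localhost'")

    print(handle_request(failing_action, log))  # generic text only
    print("details were written to", path)
```

The visitor (and therefore Google's spider) only ever sees the generic sentence, while the user name, file paths, and SQL fragments that queries like those in Table 2 look for stay in a file outside the public web root.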
Table 3 presents some sample queries for password-related data. To make our passwords less accessible to intruders, we must carefully consider where and why we enter them, how they are stored, and what happens to them. If we’re in charge of a website, we should analyze the configuration of the applications we use, locate poorly protected or particularly sensitive data, and take appropriate steps to secure it.
Both in European countries and the U.S., legal regulations are in place to protect privacy. Unfortunately, it is frequently the case that all sorts of confidential documents containing personal information are placed in publicly accessible locations or transmitted over the Web without proper protection.
To build a complete picture of us, an intruder need only gain access to an e-mail repository containing the CV we sent out while looking for work. Address, phone number, date of birth, education, skills, work experience: it’s all there. Thousands of such documents can be found on the Internet; just query Google for
intitle:"curriculum vitae" "phone * **" "address *" "e-mail". Finding contact information in the form of names, phone numbers, and email addresses is equally easy. This is because most Internet users create electronic address books of some description. While these may be of little interest to your typical intruder, they can be dangerous tools in the hands of a skilled social engineer.
A simple query such as
filetype:xls inurl:"email.xls" can be surprisingly effective, finding Excel spreadsheets called email.xls. All the above also applies to instant messaging applications and their contact lists: if an intruder obtains such a list, he may be able to pose as our friends. Interestingly enough, a fair amount of personal data can also be obtained from official documents, such as police reports, legal documents, or even medical history cards. The Web also contains documents that have been marked as confidential and therefore contain sensitive information. These may include project plans, technical documentation, surveys, reports, presentations, and a whole host of other company-internal materials.
Table 3: Google queries for locating passwords
|passwords for site, stored as the string “http://username:password@www…”|
|file backups, potentially containing user names and passwords|
|mdb files, potentially containing password information|
|pwd.db files, potentially containing user names and encrypted passwords|
|directories whose names contain the words admin and backup|
|WS_FTP configuration files, potentially containing FTP server|
|files containing Microsoft FrontPage passwords|
|files containing SQL code and passwords inserted into a database|
|configuration files for the Trillian IM|
|configuration files for the Eggdrop ircbot|
|configuration files for OpenLDAP|
|configuration files for WV Dial|
|configuration files for the Eudora mail client|
|Microsoft Access files, potentially containing user account information|
|websites using Web Wiz Journal, which in its standard configuration allows access to the passwords file: just enter http:///journal/journal.mdb instead of the default http:///|
|websites using the DUclassified, DUcalendar, DUdirectory, DUclassmate, DUdownload, DUpaypal, DUforum or DUpics applications, which by default make it possible to obtain the passwords file: for DUclassified, just enter http:///duClassified/_private/duclassified.mdb instead of http:///duClassified/|
|websites using the Bitboard2 bulletin board application, which on default settings allows the passwords file to be obtained: enter http:///forum/admin/data_passwd.dat instead of the default|
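If you run a website, you can turn Table 3 around and check your own published files for names of this kind before Google finds them. The sketch below is only a starting point: the pattern list is an assumption loosely based on the file types in the table (database files, backups, pwd.db, WS_FTP configuration), and `risky_files` is a made-up helper name.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Illustrative patterns only; adapt the list to the software your site uses.
RISKY_PATTERNS = ["*.mdb", "*.bak", "*.sql", "pwd.db", "ws_ftp.ini"]


def risky_files(paths, patterns=RISKY_PATTERNS):
    """Return the paths whose file name matches a risky pattern, ignoring case."""
    flagged = []
    for path in paths:
        name = PurePosixPath(path).name.lower()
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    published = [
        "index.html",
        "journal/journal.mdb",
        "backup/site.BAK",
        "WS_FTP.ini",
    ]
    for path in risky_files(published):
        print("should not be publicly reachable:", path)
```

Feed it the list of files under your web root (for example from a directory walk or a deployment manifest) and move or protect anything it flags.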
Table 4 presents several sample queries that reveal documents potentially containing personal information and sensitive data. As with passwords, all we can do to avoid disclosing private information is to be cautious and retain maximum control over published data. Companies and organizations should (and many are obliged to) specify and enforce rules, procedures, and standard practices for handling documents within the organization, complete with clearly defined responsibilities and penalties for infringements.
Table 4: Searching for personal data and confidential documents
|email.xls files, potentially containing contact information|
|documents containing the confidential clause|
|AIM contacts list|
|Trillian IM contacts list|
|MSN contacts list|
|database files for the Quicken financial application|
|finances.xls files, potentially containing information on bank accounts, financial summaries, and credit card numbers|
|mail log files, potentially containing e-mail|
|reports for network security scans, penetration tests, etc.|
Google Database hacking
Google Database hacking, also termed Google Dorking, is, as I explained earlier, a computer hacking method that uses Google Search and other Google applications to find security holes in the configuration and computer code that websites use.
There are two types of dorks:
1. Public dorks: dorks that are available to the public, along with an explanation of the exploit. Example: the Exploit Database.
2. Private dorks: as the name suggests, private dorks are kept for private use.
Save your search queries (private dorks) as tools. This will save you even more time once you have found a valuable query that you want to use often. You can keep them as your very own shortcuts or tools. To save your search queries, you can bookmark them in Google Chrome or Mozilla Firefox under an easy-to-remember name.
Last but not least is the Wayback Machine, the archive of the internet. It’s a useful platform for tech users, who can use it to get detailed information about the past structure of a website, for example Google, Facebook, YouTube, or even your own website if it has been archived there.
So, that’s it. I hope you liked it. If you have any questions, feel free to comment below.
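Archived pages on the Wayback Machine are addressed by a timestamp followed by the original URL, so links to past versions of a site can be built by hand or in a script. The helper below is a small sketch with a made-up name, `wayback_url`; as far as I know the service redirects to the capture closest to the timestamp you give, but treat that behaviour as an assumption and check the result in your browser.

```python
def wayback_url(page_url, timestamp):
    """Build a Wayback Machine link for a page around the given timestamp.

    The timestamp is a string of 4 to 14 digits in YYYYMMDDhhmmss order,
    e.g. "2010" or "20100101".
    """
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits (YYYYMMDDhhmmss)")
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


if __name__ == "__main__":
    print(wayback_url("https://www.google.com/", "20100101"))
```

Opening the printed link shows how the page looked around that date, provided a capture exists.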