How to make Jobmixi crawl your site:

Jobmixi.com, a pioneer in vertical search engine technology, solves a different, more specific problem than a generalist search engine. It focuses on the needs of the job market, and more specifically on job listings: it visits websites that contain jobs and collects the details of each job to build a searchable index for the Jobmixi database.

People now use search engines on a daily basis to find jobs online. Companies in need of new talent can list their websites on Jobmixi.com; whenever a job seeker searches with keywords that match the job listings, the jobs from that company's website are shown in the results. While there are many pieces to this puzzle, one of the foundational steps is to make sure your site is submitted to Jobmixi.com, India's no. 1 job search engine.

Please contact us to list your website and jobs on Jobmixi.com.

Jobmixi robot:

How to prevent Jobmixi from crawling your site:

If you wish to exclude your website from the Jobmixi index, you can place a file called robots.txt at the root of your server. This is the standard protocol that most web crawlers observe for excluding a web server or directory from an index. Please note that mixibot does not interpret a 401/403 response ("Unauthorized"/"Forbidden") to a robots.txt fetch as a request not to crawl any pages on the site.

To remove your site from Jobmixi and prevent all robots from crawling it in the future, place the following robots.txt file in your server root:

User-agent: *
Disallow: /

To remove your site from Jobmixi only and prevent just mixibot from crawling your site in the future, place the following robots.txt file in your server root:

User-agent: mixibot
Disallow: /
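
Before uploading a robots.txt file, you may want to check that its rules do what you expect. The following is a minimal sketch in Python that uses the standard-library urllib.robotparser module as a stand-in for a crawler; mixibot's own parsing may differ in details. The helper name allowed, the agent name "otherbot" and the page URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The two robots.txt files shown above.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_MIXIBOT = "User-agent: mixibot\nDisallow: /\n"

# Hypothetical page on your site.
page = "http://yourserver.com/jobs/index.html"

print(allowed(BLOCK_ALL, "mixibot", page))       # False: every robot is blocked
print(allowed(BLOCK_ALL, "otherbot", page))      # False: every robot is blocked
print(allowed(BLOCK_MIXIBOT, "mixibot", page))   # False: mixibot is blocked
print(allowed(BLOCK_MIXIBOT, "otherbot", page))  # True: other robots are unaffected
```

The check runs entirely on the text of the file, so nothing has to be uploaded to try it.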


Each port must have its own robots.txt file. In particular, if you serve content over both http and https, you need a separate robots.txt file for each of these protocols. For example, to allow all robots, including mixibot, to index all http pages but no https pages, use the robots.txt files below.

For your http protocol (http://yourserver.com/robots.txt):

User-agent: *
Allow: /

For the https protocol (https://yourserver.com/robots.txt):

User-agent: *
Disallow: /
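
The per-protocol rule can be illustrated with a short Python sketch. It assumes a crawler looks up robots.txt under the same scheme, host and port as the page it wants to fetch, and it again uses the standard-library urllib.robotparser module as a stand-in for mixibot's real behaviour. The names ROBOTS_FILES, robots_url and may_crawl, and the page URLs, are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# The two robots.txt files shown above, keyed by the URL they are served from.
ROBOTS_FILES = {
    "http://yourserver.com/robots.txt": "User-agent: *\nAllow: /\n",
    "https://yourserver.com/robots.txt": "User-agent: *\nDisallow: /\n",
}


def robots_url(page_url: str) -> str:
    """Build the robots.txt URL for the scheme, host and port of page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def may_crawl(user_agent: str, page_url: str) -> bool:
    """Check page_url against the robots.txt file of its own protocol."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_FILES[robots_url(page_url)].splitlines())
    return parser.can_fetch(user_agent, page_url)


print(robots_url("https://yourserver.com:8443/jobs"))          # https://yourserver.com:8443/robots.txt
print(may_crawl("mixibot", "http://yourserver.com/jobs.html"))   # True
print(may_crawl("mixibot", "https://yourserver.com/jobs.html"))  # False
```

The same page path gets a different answer depending on the protocol, which is why each one needs its own file.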

Jobmixi will continue to exclude your site or directories from successive crawls as long as the robots.txt file exists in the web server root. If you do not have access to the root level of your server, you may place a robots.txt file at the same level as the files you want to remove.

All rights reserved © 2008 Raising Media India Pvt Ltd.