Posts

Web scraping using the Python package Goose

Web scraping is a powerful technique for collecting large amounts of data from the internet. Companies with quality data thrive in today's world when it comes to machine learning. Let's take a scenario. You set out to build the world's best restaurant review classification system. You collect all the reviews from several restaurants and use a fancy deep learning algorithm to do the classification. It turns out your classification algorithm is not doing well out in public. What went wrong? Well, machine learning is all about capturing the patterns in the data and generalizing them well enough that the model also performs well on unseen data. Given the situation you are in, you have these options: try a GPU, incorporate the latest ML techniques, build an ensemble of many models, revisit feature engineering... or get more data. As trivial as it might sound, fetching more data would enable any ML algorithm to capture more of the patterns within the data and perform well on unseen data. I am going to talk about not so f...
Recent posts

Working with Google Cloud for speech-to-text for Indian languages and translation

Google Cloud API. In this post, I would like to share my experience of using Google Cloud. I have created an account under the Google Cloud program. Using this, I will develop Python code to convert an audio clip in an Indian language to text. Later, I will use the Google Translate API to convert the text to English. This post will be updated once I have written code samples. I will share my code, the implementation and a demo. To be continued...

Text classification using a CNN written in TensorFlow.

Problem statement: You are supposed to build a model that automatically classifies an article under Finance, Law, Fashion or Lifestyle. Use data from leading magazines to train the model. Solution: GitHub repo: link. In the past, I had used NLTK and Python to solve this problem, but neural networks have often proven to be more accurate on NLP tasks. I researched text classification libraries and different approaches to this problem and decided to use a CNN (convolutional neural network). I have used Denny Britz's code to implement the CNN. Here is the link to his blog post. I will describe the files, the procedure I followed to get the data, train the model and test the model, and the results. First, I went to the leading newspaper The Guardian and looked for the labels, i.e. Finance, Law, Fashion and Lifestyle. Scraping the data from a single source helps keep the articles homogeneous. I have used ...

Back to blogging after two long years!!

What have I been up to? Once I left Magarpatta, Pune (June 20, 2014), I was excited to know that I was capable of doing great work in technology. I worked on four projects at the same time and with multiple managers. I learned that I can contribute to open source (OpenStack). By contributing to open source, you gain a wealth of knowledge. I think if you put in one hour of your time to contribute to FOSS, you gain 10x the knowledge and exposure. As a fresh graduate from BITS Pilani Goa, it was a tremendous learning experience. I was talking directly to several developers, architects and industry experts, and reading about how companies work and about their businesses. At that point, I got a whole new perspective on how things work in the industry and how open source is powering the world. Coming to the present: unemployed :P :P as of June 10, 2016. For the past 2 years, I was working as a Research and Development engineer at IPsoft Labs, Bangalore. Working there, I got exposure to a lot of n...

Infinite recursion ;)

My first attempt at blogging has been featured in OpenStack Superuser, Mirantis tutorials and Opensource.com. It gives me great motivation to learn more about OpenStack and to blog about the latest innovations happening around it.
OpenStack Superuser: http://superuser.openstack.org/articles/how-to-contribute-to-openstack
Mirantis Tutorials (14th in Additional): http://www.mirantis.com/openstack-portal/external-tutorials/tutorials-week-may-19-2014-may-26-2014/
Opensource.com (How to contribute to OpenStack): http://opensource.com/business/14/7/openstack-news-july-21
In future, I would like to keep learning about OpenStack and update my blog as I go. Stay tuned ;)

My contribution to OpenStack

Contribution to OpenStack. Stackalytics: Stackalytics is a one-stop place for looking at someone's contributions to OpenStack. It has a great user interface where we can find the modules a user has contributed to and a timeline of their contributions. Here is my Stackalytics page link. I have mainly contributed to Openstack-manuals, which holds the documentation for OpenStack projects (installation, user guides and admin guides), and to Openstack-API, which hosts the documentation related to the APIs of several OpenStack components.