Monday, August 25, 2014

Sentiment Analysis on Tweets of Pakistani Journalists

An edited version of this post was published in The Express Tribune on October 27, 2014.

The events taking place around us affect our feelings, and our feelings affect our everyday conversation. For the last three weeks, a major event, the Azadi March, has been under way in Pakistan, and it is affecting the feelings of the average Pakistani on the street. Since I work in data mining and text mining, I carried out a small experiment to gauge the sentiments of the Pakistani journalists who are reporting on it. I took journalists as the test case because people listen to them and are affected by their feelings.

Twitter data is commonly used to understand how people feel. Researchers in the USA have used Twitter data to understand public feeling during presidential elections. To understand the feelings of the Pakistani journalists, I applied sentiment analysis to their tweets from the last three weeks. I used the twitteR library for R to extract the tweets and the Datumbox Twitter Sentiment Analysis API to rate each tweet as positive, negative, or neutral depending on its context. I used the last three weeks of tweets from Cyril Almeida, Fahad Hussain, Fereeha, Hamid Mir, Iftikhar Ahmad, Jasmeen Manzoor, Javed Chaudhry, Kashif Abbasi, Moeed Pirzada, Mushtaq Minhas, Rauf Kalasra, Raza Rumi, Shahzeb Khanzada, and Talat Hussain. Due to limited time, I wasn’t able to run the experiment on other journalists.
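The tallying step behind the percentages can be sketched in a few lines. This is a minimal Python sketch, not the R code used in the experiment: it assumes every tweet has already been labelled positive, negative, or neutral by some classifier, and the function name `sentiment_percentages` and the sample labels are hypothetical.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")

def sentiment_percentages(labels):
    """Turn a list of per-tweet labels into percentage shares per label."""
    counts = Counter(labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        # No labelled tweets: report zero for every class instead of dividing by zero.
        return {label: 0.0 for label in LABELS}
    return {label: round(100.0 * counts[label] / total, 2) for label in LABELS}

# Hypothetical input: 25 tweets that a classifier has already labelled.
sample = ["positive"] * 13 + ["negative"] * 6 + ["neutral"] * 6
shares = sentiment_percentages(sample)
```

Working with shares rather than raw counts makes journalists comparable even when they tweet at very different rates.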

The results of the experiment are very interesting. Moeed Pirzada and Fahad Hussain come out as the most positive among their peers, while Mushtaq Minhas tops the group in negative sentiment. Hamid Mir tops the group in neutral sentiment. Not only is his neutral share the highest, but his positive and negative shares are so low that he appears to hide his feelings better than any of his peers. The same is true of Iftikhar Ahmad, Rauf Kalasra, and Shahzeb Khanzada. Positive sentiment is higher than negative for Fahad Hussain, Hamid Mir, Jasmeen Manzoor, Javed Chaudhry, Moeed Pirzada, and Raza Rumi. Cyril Almeida, Fereeha, Jasmeen Manzoor, and Kashif Abbasi split their tweets between positive and negative sentiment, so their neutral share is lower than both their positive and their negative share. Mushtaq Minhas's neutral share is also lower than both, but that is because most of his tweets were judged negative. Cyril Almeida, Fereeha, Kashif Abbasi, Mushtaq Minhas, Rauf Kalasra, and Talat Hussain come out as the journalists whose tweets carry more negative than positive sentiment. The results of the sentiment analysis are shown in Table 1 and Figure 1.

In the future, if I get time, I will run an experiment on the sentiments of the people replying to these journalists, to understand how strongly people are affected by the journalists' sentiments and how they reply after picking up the implicit sentiment in the tweets. Until then, kindly let me know whether or not you agree with the Datumbox Twitter sentiment analysis engine's results about these journalists.

Figure 1: Graph of Sentiment Analysis Experiment

Table 1: Results of Sentiment Analysis Experiment

Journalist (Twitter handle)         Positive %   Negative %   Neutral %
Cyril Almeida (@cyalm)              43.1         46.55        10.34
Fahad Hussain (@Fahdhusain)         50.88        21.05        28.07
Fereeha (@Fereeha)                  40.3         43.28        16.42
Hamid Mir (@HamidMirGEO)            21.31        18.03        60.66
Iftikhar Ahmad (@jawabdeyh)         26.67        26.67        46.67
Jasmeen Manzoor (@jasmeenmanzoor)   39.68        33.33        26.98
Javed Chaudhry (@javedchoudhry)     39.24        20.25        40.51
Kashif Abbasi (@Kashifabbasiary)    36.73        46.94        16.33
Moeed Pirzada (@MoeedNj)            52           24           24
Mushtaq Minhas (@mushtaqminhas)     31.91        65.96        2.13
Rauf Kalasra (@KlasraRauf)          25.42        27.12        47.46
Raza Rumi (@Razarumi)               38           21           41
Shahzeb Khanzada (@shahzebkhanzda)  28.12        28.12        43.75
Talat Hussain (@TalatHussain12)     25           30           45
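
The groupings discussed in the text can be re-derived from Table 1. The following is a minimal Python sketch: the figures are copied from Table 1, while the variable and function names are made up for illustration.

```python
# (positive %, negative %, neutral %) per journalist, copied from Table 1.
TABLE_1 = {
    "Cyril Almeida": (43.1, 46.55, 10.34),
    "Fahad Hussain": (50.88, 21.05, 28.07),
    "Fereeha": (40.3, 43.28, 16.42),
    "Hamid Mir": (21.31, 18.03, 60.66),
    "Iftikhar Ahmad": (26.67, 26.67, 46.67),
    "Jasmeen Manzoor": (39.68, 33.33, 26.98),
    "Javed Chaudhry": (39.24, 20.25, 40.51),
    "Kashif Abbasi": (36.73, 46.94, 16.33),
    "Moeed Pirzada": (52, 24, 24),
    "Mushtaq Minhas": (31.91, 65.96, 2.13),
    "Rauf Kalasra": (25.42, 27.12, 47.46),
    "Raza Rumi": (38, 21, 41),
    "Shahzeb Khanzada": (28.12, 28.12, 43.75),
    "Talat Hussain": (25, 30, 45),
}

def group_by_balance(table):
    """Split journalists by whether positive or negative sentiment dominates."""
    groups = {"more_positive": [], "more_negative": [], "tied": []}
    for name, (positive, negative, _neutral) in table.items():
        if positive > negative:
            groups["more_positive"].append(name)
        elif negative > positive:
            groups["more_negative"].append(name)
        else:
            groups["tied"].append(name)
    return groups

groups = group_by_balance(TABLE_1)
most_neutral = max(TABLE_1, key=lambda name: TABLE_1[name][2])
most_negative = max(TABLE_1, key=lambda name: TABLE_1[name][1])
# Journalists whose neutral share is below both their positive and negative shares.
low_neutral = [name for name, (p, n, u) in TABLE_1.items() if u < p and u < n]
```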

Wednesday, July 10, 2013

Halalgoogling: the constructive feedback

"If I were white, I'd get less criticism": Lenny Kravitz. The same principle applies to the newly launched search engine "Halalgoogling". If it were not "halalafied" or "for Muslims", it would have drawn less criticism. The Express Tribune recently published a blog criticizing the search engine. The focal point of that blog was to attack the engine's name by showing that it cannot filter a few so-called "haram" words. Is the criticism of "halalgoogling" constructive? I don't think so.

Criticizing the search engine for failing to filter a few uncommon words is not justified. The blog published in The Express Tribune suggests that its author was in more of a hurry to write than to read what the developers have said about the limitations. According to the developers' own blog, anyone who finds an objectionable word that is not being filtered should report it or suggest it to them. The criticism made so far has therefore already been answered by the developers.

If we sincerely want to give feedback, it should be constructive feedback that helps the developers improve their work. Here are the points about the search engine that I think should be put to the developers.

  1. Why should I use "halalgoogling" instead of Google with SafeSearch? You have done some really good work technically. However, I don't find any reason to use your search engine. Selling a product by attaching religion to it is a recipe for failure. There are numerous examples of such products in the Pakistani market.
  2. The length of the name "halalgoogling" is also an obstacle to using it. Have you ever wondered why "Yahoo", "Google", and "Bing" were adopted so quickly by people all over the world? A name as long as "halalgoogling" will certainly be an obstacle to adoption.
  3. The word Google is not a synonym for the word "search". It is an "aween" word and now the proprietary name of a search engine company. You have used their name in your search engine's name. Why?
  4. Will I get better search results through your search engine? Have you developed a new search algorithm? If so, when are you going to patent it? Google might buy it from you, as they also paid 336 million US dollars to Stanford University to use its patented algorithm "PageRank". Google became more popular than the other search engines in 1999 and 2000 only because PageRank produced better results. Is your algorithm comparable to PageRank?
  5. Why is there no credit to the real search engine in the results if you are just applying filters to someone else's search results? If you are using the "Google API" you must mention it somewhere. On one side you are attaching religion, and on the other you are not giving credit to the real search engine company. Doesn't that contradict Islamic values?
  6. The performance of a filtering system goes down as the list of filtered words grows. How can you convince me that I won't have to wait longer than on the established search engines? What is the value of a search engine if I have to wait longer for the results?
  7. The menu bar is not stable. The login link appeared after I searched for some words. The link didn't respond at all. It just disappeared when I went back to the first page. This should not happen on a professional website or search engine.
  8. Is my privacy safe on "halalgoogling"? Does your search engine keep a history of my searches? What is the purpose of a "login" in a search engine if you do not store my personalized searches? If you do store the searches, how can I be sure of my privacy? Will you use data mining techniques on my searches to judge my behaviour and likings?


To summarize all of the above, I would say that "halalgoogling" is a good effort but not convincing enough to make me use it. The developers should offer something other than religion to sell their product; otherwise negative feedback will keep rolling in.

Monday, July 30, 2012

Electrolysis and Pakistani Media

Almost every other day, a story pops up in the Pakistani media that a technician in some part of Pakistan has developed a method to produce electricity by burning water, or a method to run a car using water as fuel. Recently Hamid Mir, a renowned TV anchor, broadcast a show on a similar concept. Is it really possible to use water as a fuel?

Water is not a fuel and it doesn’t burn, but hydrogen is and does. Water is a compound of hydrogen and oxygen. Unfortunately, hydrogen is not freely available. To use hydrogen as a fuel, water molecules have to be broken down. The most common method of breaking down the water molecule is known as electrolysis, in which electricity is used to separate the hydrogen from the oxygen. Every matric/FSc student tests this method in the school lab. Now these people are using electrolysis to generate hydrogen, which is then fed to the engine through the air intake pipe, just like CNG, to be burned and run the engine.

But wait a minute: what is the actual source of energy that is running the engine? It’s the electricity. Without electricity, the water molecule will not be broken down, and no hydrogen will be produced or burned in the engine. Where does this free electricity come from, and at what cost? You are using electricity to run an engine or to generate electricity.

The Pakistani media is extremely strange. Instead of doing a cost analysis of the electrolysis process, TV anchor Hamid Mir tested the system by driving the car, and can you guess what he said?

“Ye to bilkul normal gari ki tarah chal rahi hay”
(It is running like a normal car)

What did you expect? Should the car fly on water or hydrogen fuel? Because it was running normally, you declared it the cheapest method. You and those technicians should be given the Nobel Prize.

According to Hamid Mir, the oil industry is not allowing water to be used as an alternative fuel. In my view this is not correct. It is the overall cost (including maintenance cost) of cars running on electrolysis that prevents them from being used as an alternative.

The inventor, Agha Waqar, challenged Dr Atta ur Rehman in another show, hosted by TV anchor Talat Hussain:

“Ager battery dead bi ho, app Dhaka laga k gari start ker lain to battery to charge ho jati ha”
(Even if the battery is dead, if you push-start the car, the battery gets charged)

So, according to him, the battery will never run down, and it is a complete cycle in which more energy is produced than is consumed. But this is not true. We have to look at the complete cycle to understand what is happening behind it.

Cycle:
The car first runs its engine on gasoline, and energy is lost to heat and friction. The motion of the engine drives the alternator, which generates electricity that is stored in the battery, and energy is lost again in this step to friction and electrical resistance. Electricity from the battery is then used to separate hydrogen by electrolysis, where energy is lost once more as heat, and to other chemical reactions if the water is not pure. The hydrogen is then used to run the engine instead of gasoline.

In each cycle, a lot of energy is wasted. Each time, the battery is charged less because of the losses, and as a result less hydrogen is produced. After some time, the battery has to be charged again using gasoline or some external source.

On average, about five times as much electricity is consumed as is generated by this method.
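
The shrinking cycle described above can be sketched numerically. This is a minimal Python sketch under stated assumptions: the function name `energy_after_cycles` is made up, and the three stage efficiencies are hypothetical round numbers chosen for illustration, not measurements of any real kit.

```python
from math import prod

def energy_after_cycles(initial_wh, stage_efficiencies, cycles):
    """Return the energy (Wh) available at the start and after each full loop.

    One loop is battery -> electrolysis -> engine -> alternator -> battery.
    Each stage keeps only a fraction of the energy it receives.
    """
    round_trip = prod(stage_efficiencies)  # fraction that survives one loop
    history = [initial_wh]
    for _ in range(cycles):
        history.append(history[-1] * round_trip)
    return history

# Hypothetical efficiencies, for illustration only.
ASSUMED_STAGES = {"electrolysis": 0.70, "engine": 0.30, "alternator": 0.60}

if __name__ == "__main__":
    levels = energy_after_cycles(1000.0, ASSUMED_STAGES.values(), 4)
    for cycle, wh in enumerate(levels):
        print(f"after loop {cycle}: {wh:.1f} Wh")
```

With these assumed numbers only about 12.6% of the energy survives each loop, and that is before counting the energy needed to actually move the car. As long as every stage is below 100% efficient, the battery level can only fall.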

1000 m³ of H2 will generate 1 MWh of electricity and 1 MWh of heat.

Last but not least, distilling the water also consumes energy, mostly in the form of electricity or burning oil.

Agha Waqar's misconceptions:
1. The sound system in a car also consumes battery power, so what's the big deal? Brother, the sound system is only a consumer of energy. In your system, battery power is the starting point, and without electricity the whole system will not work. These are two entirely separate things.

2. It will save millions of rupees because oil imports will be reduced. How? Then who will charge the batteries? To charge the batteries you have to burn oil or use some other method. Oil will still be required, and with heavy battery usage, battery prices will shoot up at the same time.

3. Very little battery power is being used. OK, but how much? Have you calculated that?

4. The car is moving, so the method is the best. Sorry, but unless you show that the overall cost of using the kit is less than the cost of the oil it saves, it cannot be called a cheap method.


Missing points:
As this discussion is getting long, I will not explain the other points about this method that the media has not addressed, such as the cost of electricity for electrolysis, the lifetime and cost of the electrodes used in the electrolysis process, the cost of distilled water, and the effect of burning hydrogen on engine efficiency. Hydrogen is also an odorless gas, so a leak is very hard to detect, which raises safety concerns as well. If you take all this into account, you can see that this is not a cheap method at all.

To summarize all of the above, I would say the media should do its homework before presenting something as a free energy source. It is good that many technicians and scientists are working on alternative sources of energy, but water, or electrolysis, is not a free source of energy or an alternative fuel unless you can also show on paper that the overall cost of this kit is less than the cost of the oil it saves.
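
The cost test stated above can be written down explicitly. This is a minimal Python sketch: the function name, the cost categories as parameters, and the sample figures are all hypothetical, and every amount must cover the same period and use the same currency.

```python
def kit_saves_money(fuel_cost_saved, kit_price, electricity_cost,
                    electrode_cost, distilled_water_cost, maintenance_cost):
    """Return True only if the kit's total cost is below the fuel cost it saves."""
    total_kit_cost = (kit_price + electricity_cost + electrode_cost
                      + distilled_water_cost + maintenance_cost)
    return total_kit_cost < fuel_cost_saved

# Hypothetical yearly figures in rupees, for illustration only.
worth_it = kit_saves_money(
    fuel_cost_saved=60_000,
    kit_price=40_000,
    electricity_cost=15_000,
    electrode_cost=8_000,
    distilled_water_cost=3_000,
    maintenance_cost=5_000,
)
```

A car that merely moves does not pass this test; only the full cost comparison does.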