I help teams daily with the solution engineering aspect of connected vehicle data projects (massive datasets, and always some new datasets arriving with new car models, aka new technologies). Lately, in my spare time, I have been applying some ML/deep learning techniques to datasets (many are created based on observations of real datasets). Half of this blog shares thoughts on my work; the other half is about my family and friends.
Thursday, July 03, 2008
Eclipse Ganymede C/C++ CDT IDE platform & my first impressions
1. Make sure you have a stable JDK on your machine (in my case, I am using JDK 1.5).
2. Download MinGW from http://www.mingw.org/
& install it (make sure to enable the g++ installation option) &
update your system PATH setting.
After step 2, in your Windows command prompt,
if you type g++ -v it will display the following.
>>>
C:\a_eclipse_c_c++\eclipse-cpp-ganymede-win32\eclipse>g++ -v
Reading specs from C:/MinGW/bin/../lib/gcc/mingw32/3.4.5/specs etc. etc.
>>
3. Download Eclipse & the CDT environment from the following location.
http://www.eclipse.org/cdt/index.php
4. After installing Eclipse, start it.
5. Start with File->New->Project->C++ and give it a default name etc.
6. At this stage you are ready to go with a new class or import some existing files.
7. If you are an experienced Eclipse IDE user (i.e. for Java code), building & running the code is very easy. Otherwise you have to play with Run-> New Configuration.
Unfortunately, Eclipse-> Help Contents sucks.
I finished steps (2) to (7) in less than 30 minutes. I wrote a few C & C++ programs & everything is looking good. More later.
Friday, February 15, 2008
Integrate Full Text search functionality in to your applications with Lucene (searching)
public List search(){
List searchResult = new ArrayList();
IndexSearcher indexSearcher = null;
// a) Get the indexer.
// b) Build the default query parser.
//    (Since this was a demo, accept all wildcard characters in the input.)
// c) Now search the indexes.
//    (We already built them in the earlier step.)
Hits hits = indexer.search(query);
My conclusions:
Lucene is a pure Java product that provides:
* ranked searching; best results returned first {I need to test more here}
* a good number of query types:
phrase queries, wildcard queries, proximity queries, range queries etc.
* fielded searching (e.g., title, path, contents)
* date-range searching
* sorting by any field
* multiple-index searching (I am working on this one right now)
* simultaneous update and searching
I am looking forward to a C/C++ implementation.
Wednesday, January 23, 2008
Integrate Full Text search functionality in to your applications with Lucene (indexing)
Any full-text search functionality involves indexing the data first, and Lucene is no different. Once your data is indexed, it can perform full-text searches very fast. I indexed 17,000 HTML files (my product documentation) in less than 5 minutes.
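To show why indexing first makes the search step fast, here is a toy inverted index in plain Java. It is not Lucene and does no ranking or analysis; TinyIndex and its method names are hypothetical, and it assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index (an illustration only, not the Lucene API).
class TinyIndex {
    // term -> ids of the documents that contain it, in insertion order of the documents
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> names = new ArrayList<>();

    /** Indexing: tokenize once, up front, and record which document holds each term. */
    void add(String name, String text) {
        int id = names.size();
        names.add(name);
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }
    }

    /** Searching: a single map lookup, with no scan over the document texts. */
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        Set<Integer> ids = postings.getOrDefault(term.toLowerCase(), Collections.<Integer>emptySet());
        for (int id : ids) {
            hits.add(names.get(id));
        }
        return hits;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("install.html", "How to install the product");
        index.add("search.html", "Search the product documentation");
        System.out.println(index.search("product"));
        System.out.println(index.search("install"));
    }
}
```

The expensive work (reading and tokenizing every file) happens once at indexing time, which is the same trade-off Lucene makes on a much larger scale.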
Creating the index writer & adding documents are the key methods.
The rest of the methods are for bookkeeping.
The following code indexes the .html and .htm files in a folder. (It recursively iterates over the nested folders & indexes each file.)
// your data
public static final String dataDir = "D:\\webapps\\help";
//the directory that is used to store lucene index
private final String indexDir = "D:\\help_index";
public static String src1 = "";
public IndexWriter indexWriter;
public static int numF;
public static int numD;
public void openIndexWriter()throws IOException
{
Directory fsDirectory = FSDirectory.getDirectory(indexDir);
Analyzer analyzer = new StandardAnalyzer();
indexWriter = new IndexWriter(fsDirectory, true, analyzer);
indexWriter.setWriteLockTimeout(IndexWriter.WRITE_LOCK_TIMEOUT * 100 );
}
public void closeIndexWriter()throws IOException
{
indexWriter.optimize();
indexWriter.close();
}
public void indexFiles(String strPath) throws IOException
{
File src = new File(strPath);
if (src.isDirectory())
{
numD++;
String list[] = src.list();
try
{
for (int i = 0; i < list.length; i++)
{
src1 = src.getAbsolutePath() + File.separatorChar + list[i];
File file = new File(src1);
/*
* Try check like read/write access check etc.
*/
if ( file.isDirectory() )indexFiles(src1);
else
{
numF++;
if(src1.endsWith(".html") || src1.endsWith(".htm")){
addDocument(src1, indexWriter);
}
}
}
}catch(java.lang.NullPointerException e){}
}
}
public boolean createIndex() throws IOException{
if(true == ifIndexExist()){
return true;
}
File dir = new File(dataDir);
if(!dir.exists()){
return false;
}
File[] htmls = dir.listFiles();
Directory fsDirectory = FSDirectory.getDirectory(indexDir);
Analyzer analyzer = new StandardAnalyzer();
IndexWriter indexWriter = new IndexWriter(fsDirectory, analyzer, true);
for(int i = 0; i < htmls.length; i++){
String htmlPath = htmls[i].getAbsolutePath();
if(htmlPath.endsWith(".html") || htmlPath.endsWith(".htm")){
addDocument(htmlPath, indexWriter);
}
}
indexWriter.close(); // flush the added documents and release the write lock
return true;
}
/**
* Add one document to the lucene index
*/
public void addDocument(String htmlPath, IndexWriter indexWriter){
//System.out.println("\n adding file to index "+htmlPath );
HTMLDocParser htmlParser = new HTMLDocParser(htmlPath);
String path = htmlParser.getPath();
String title = htmlParser.getTitle();
Reader content = htmlParser.getContent();
Document document = new Document();
document.add(new Field("path",path,Field.Store.YES,Field.Index.NO));
document.add(new Field("title",title,Field.Store.YES,Field.Index.TOKENIZED));
document.add(new Field("content",content));
try {
indexWriter.addDocument(document);
} catch (IOException e) {
e.printStackTrace();
}
}
im.openIndexWriter();
File src = new File(dataDir);
if(!src.exists()){
System.out.println("\n DATA DIR DOES NOT EXISTS" );
return;
}
long start = System.currentTimeMillis();
System.out.println("\n INDEXING STARTED" );
im.indexFiles(dataDir);
im.closeIndexWriter();
long end = System.currentTimeMillis();
long diff = (end-start)/1000;
System.out.println("\n Time consumed in Index the whole help=" +diff );
System.out.println("Number of files :\t"+numF);
System.out.println("Number of dirs :\t"+numD);
}
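The recursive walk above can also be written without relying on a caught NullPointerException. This is a minimal sketch using only java.io.File; HtmlFileWalker, collect and isHtml are hypothetical names. Unlike the code above, it matches extensions case-insensitively, which is an assumption on my part.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Sketch: collect the .html/.htm files under a folder, recursing into sub-folders.
class HtmlFileWalker {
    /** Recursively collects the .html and .htm files under root. */
    static List<File> collect(File root) {
        List<File> found = new ArrayList<File>();
        walk(root, found);
        return found;
    }

    private static void walk(File dir, List<File> found) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // not a directory, missing, or not readable
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                walk(entry, found);
            } else if (isHtml(entry.getName())) {
                found.add(entry);
            }
        }
    }

    /** True for names ending in .html or .htm (case-insensitive, by assumption). */
    static boolean isHtml(String name) {
        String lower = name.toLowerCase();
        return lower.endsWith(".html") || lower.endsWith(".htm");
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File f : collect(root)) {
            System.out.println(f.getPath());
        }
    }
}
```

Each collected file could then be passed to addDocument() in place of the inline loop.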
Friday, January 11, 2008
Populating MySql table from MS Excel { aka .csv} file
{I felt the Apache, PHP & MySQL combination fits their need.
I will explain that application a little later.}
I had already received some data in an MS Excel file.
Strangely, some trailing columns were missing in some records after saving the Excel file as a csv file.
The following command fails with MySQL errors about truncated columns.
mysql> load data infile 'C://bea//temple//dpexport.csv' into table donar_info_4 fields terminated by ',' OPTIONALLY ENCLOSED BY '"' lines terminated by '\n';
After spending a little more time with the MySQL documentation,
I found out that the IGNORE option does the magic &
I was able to load the csv file into my MySQL table. Just add IGNORE right after the file name, before INTO TABLE.
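An alternative to IGNORE (not what was done above) is to pad the short rows back to the expected column count before loading. Below is a minimal Java sketch; CsvRowPadder and its methods are hypothetical names, and it assumes one record per line with no line breaks inside quoted fields.

```java
// Sketch: restore the trailing empty columns that went missing in the csv export.
class CsvRowPadder {
    /** Counts the fields in one CSV line, treating commas inside double quotes as data. */
    static int countFields(String line) {
        int fields = 1;
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // an escaped quote ("") toggles twice, so the state is unchanged
            } else if (c == ',' && !inQuotes) {
                fields++;
            }
        }
        return fields;
    }

    /** Appends empty trailing fields until the line has the expected number of columns. */
    static String pad(String line, int expectedColumns) {
        StringBuilder sb = new StringBuilder(line);
        for (int n = countFields(line); n < expectedColumns; n++) {
            sb.append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(pad("a,b", 4));
    }
}
```

With every row padded to the table's column count, the plain load data infile statement has complete records to work with.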
Wednesday, January 02, 2008
Is the Amazon Kindle the next iPod?
Suddenly a young man sat next to me, seriously browsing the web with his Amazon Kindle. (What made me curious was that he was reading one of my favorite web sites, the New York Times.) As I was paying attention to the gadget, he slowly started talking about all the good things about it & offered to let me try it. I quickly checked the weight & the look and feel of a blog & an e-book on it. It was not so heavy & the look and feel was very good & natural. I liked it a lot. I was so tempted that after coming home, I checked the Amazon website for the Kindle.
(A little pricey. It was sold out & there seems to be a lot of demand for the gadget.) Most of the reviews are positive. This incident reminds me of an older one, my first experience with the iPod. Nearly 4+ years back, while coming back from my vacation (India->London->US), a person sitting next to me was proudly explaining & talking about his new iPod. (If I remember correctly, it was the second month after the iPod launch.) Now I am seeing the same thing happening with Amazon's Kindle. I liked the way it is designed (automatic download of content to the device), its convenience & the thought process behind it. I was so surprised that Amazon came up with this kind of device. {After its initial launch as a major online book store, this is the best thing from Amazon. My personal opinion.} I hope to own an Amazon Kindle very soon. It fits my taste.
Wednesday, December 19, 2007
An update & my progress in Kiva
Following is my kiva lender page http://www.kiva.org/lender/dhasa
Friday, November 16, 2007
Java DefaultListModel performance issues
Recently I received a big customer escalation in the search domain.
Basically, the end user was searching for enterprise information based on input criteria.
In our Java rich client, we show a simple dialog to select from the list of users in the enterprise. This action was consuming 10+ minutes.
The UI works fine for a simple 100 to 1000 users.
However, the customer was testing with 10K-plus users.
After analyzing all the code on the server side, I finally looked at the client layer.
At the client, server data was getting added to a DefaultListModel with addElement() in a for loop.
The real culprit is the addElement() method. (Each call fires an intervalAdded event to every registered listener, so 10K elements means 10K UI notifications.)
After looking at the implementation of the above method & its sequence of event calls, &
a little bit of browsing the Java forums, I found out that we should not use the above class as-is for large lists. Yes, never use DefaultListModel directly. This problem still exists in JDK 1.5. There are multiple solutions to this problem. Just Google it. You will find many.
I made a quick fix based on some advice from Sun's forums. (Basically it is fast & I am seeing a 90% improvement.)
Steps:
1) Extend DefaultListModel as shown below
class FastListModel extends DefaultListModel {
    private boolean listenersEnabled = true;

    public boolean getListenersEnabled() {
        return listenersEnabled;
    }

    public void setListenersEnabled(boolean enabled) {
        listenersEnabled = enabled;
    }

    // Forward "interval added" events only while listeners are enabled.
    public void fireIntervalAdded(Object source, int index0, int index1) {
        if (getListenersEnabled()) {
            super.fireIntervalAdded(source, index0, index1);
        }
    }
}
2) Add list listener to your list model
ListDataListener listener = new ListDataListener() {
    public void intervalAdded(ListDataEvent e) {
        list.ensureIndexIsVisible(e.getIndex1());
    }
    public void intervalRemoved(ListDataEvent e) { }
    public void contentsChanged(ListDataEvent e) { }
};
model.addListDataListener(listener);
3) Turn the listeners off & on explicitly
model.setListenersEnabled(false);
// add content in a for loop with model.addElement(...)
// now enable the listeners
model.setListenersEnabled(true);
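Putting the three steps together, here is a small self-contained sketch that can be run without any UI. The addBulk() helper, the demo class name and the single fireIntervalAdded call after the loop are my additions, not part of the original fix; the idea is that listeners hear about the whole range once instead of once per element.

```java
import javax.swing.DefaultListModel;
import javax.swing.event.ListDataEvent;
import javax.swing.event.ListDataListener;

// Sketch: mute list events during a bulk add, then notify listeners once.
class FastListModelDemo {
    static class FastListModel extends DefaultListModel<String> {
        private boolean listenersEnabled = true;

        void setListenersEnabled(boolean enabled) {
            listenersEnabled = enabled;
        }

        @Override
        protected void fireIntervalAdded(Object source, int index0, int index1) {
            if (listenersEnabled) {
                super.fireIntervalAdded(source, index0, index1);
            }
        }

        /** Hypothetical helper: adds count items with events muted, then fires one event for the range. */
        void addBulk(int count) {
            if (count <= 0) {
                return;
            }
            int first = getSize();
            setListenersEnabled(false);
            try {
                for (int i = 0; i < count; i++) {
                    addElement("user" + i);
                }
            } finally {
                setListenersEnabled(true);
            }
            fireIntervalAdded(this, first, getSize() - 1);
        }
    }

    /** Returns how many intervalAdded events a listener sees for one bulk add. */
    static int eventsForBulkAdd(int count) {
        FastListModel model = new FastListModel();
        final int[] events = {0};
        model.addListDataListener(new ListDataListener() {
            public void intervalAdded(ListDataEvent e) { events[0]++; }
            public void intervalRemoved(ListDataEvent e) { }
            public void contentsChanged(ListDataEvent e) { }
        });
        model.addBulk(count);
        return events[0];
    }

    public static void main(String[] args) {
        System.out.println("events for 10000 adds: " + eventsForBulkAdd(10000));
    }
}
```

Firing one event at the end matters: without it, a JList attached to the model would not learn about the elements that were added while the listeners were muted.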
Wednesday, October 03, 2007
New Java Runtime methods
Use Java Runtime methods like maxMemory(), freeMemory(), totalMemory() etc. to know the memory usage of your application or module. It always helps.
Also use availableProcessors() wisely if your application is spawning too many threads.
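A minimal sketch of how these calls might be used together. MemoryReport, usedBytes() and workerThreads() are hypothetical names; sizing a thread pool by the processor count is one common choice, not the only one.

```java
// Sketch: report heap usage and pick a worker-thread count from the Runtime.
class MemoryReport {
    /** Heap currently in use: what the JVM has allocated minus what is still free in it. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One worker per available processor, as a starting point for thread-pool sizing. */
    static int workerThreads() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        System.out.println("max   (MB): " + rt.maxMemory() / mb);   // upper limit the JVM will try to use
        System.out.println("total (MB): " + rt.totalMemory() / mb); // currently reserved by the JVM
        System.out.println("free  (MB): " + rt.freeMemory() / mb);  // unused part of the reserved heap
        System.out.println("used  (MB): " + usedBytes() / mb);
        System.out.println("worker threads: " + workerThreads());
    }
}
```

Calling usedBytes() before and after a module runs gives a rough idea of what that module holds on to, keeping in mind that garbage collection timing affects the numbers.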
Thursday, September 06, 2007
Hooray, XSLT is now part of JDK 1.5
(Thank God, we don't have to download Xalan, Xerces etc.
& set up big class paths to author style sheets.)
On the minus side, the Java implementation of Xalan sucks badly.
It works fine for simple transformation use cases
but fails for the complex ones.
(I will write up those test cases in the next post.)
The following is a sample Test.java that transforms the input XML into HTML (or any other output) using the Java Xalan libs.
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import java.io.FileInputStream;
import java.io.FileOutputStream;
public class Test
{
public static void main(String[] args) throws Exception
{
Source source = new StreamSource(new FileInputStream("C://AE_html.xsl"));
Transformer t = TransformerFactory.newInstance().newTransformer(source);
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new FileInputStream("c://users.xml"));
System.out.println("Transforming...");
t.transform(new DOMSource(doc), new StreamResult(new FileOutputStream("C://users.html")));
}
}
//System.out.println( "doc as text content"+ doc.getTextContent() );
//t.transform(new DOMSource(doc), new StreamResult(System.out));
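For quick experiments, the same JDK transformer can be run fully in memory, with no files on disk. This is a minimal sketch; InMemoryXslt, the tiny stylesheet and the sample XML are made up for illustration and are not the AE_html.xsl or users.xml used above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Sketch: apply an XSLT stylesheet to an XML string using only the JDK.
class InMemoryXslt {
    // Hypothetical stylesheet: prints each user's name followed by a semicolon, as plain text.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/users'>"
      + "<xsl:for-each select='user'><xsl:value-of select='@name'/>;</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical input document.
    static final String XML = "<users><user name='alice'/><user name='bob'/></users>";

    /** Transforms xml with xsl and returns the result as a string. */
    static String transform(String xsl, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XSL, XML));
    }
}
```

A small harness like this makes it easier to isolate the complex stylesheet cases that fail, since each case is just a pair of strings.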
Thursday, August 30, 2007
Tuesday, August 28, 2007
Monday, August 27, 2007
Monday, June 04, 2007
More online references
Scholarpedia
www.scholarpedia.org
Conservapedia
www.conservapedia.org
Citizendium
www.citizendium.org
Wednesday, April 11, 2007
Became a member in kiva
But the following program was really inspiring. (It is a small part of Frontline.)
(A Little Goes a Long Way)
Watch it online first.
http://www.pbs.org/frontlineworld/stories/uganda601/
I immediately decided to become a member of kiva.
I believe in the concept (microfinance & direct lending to the needy).
I have watched many programs on microfinance.
But I never knew how to be part of it or whether an individual can join.
www.kiva.org helped me to contribute.
At present I am thinking of contributing to the same for another year.
I started with 6 & my goal is to help 50 at all times.
Please read the FAQ etc. on www.kiva.org before jumping in.
“It's a new, direct and sustainable way to fight global poverty, and the way I see it, I get a higher return on $25 helping someone build a future than the interest my checking account pays.”
Following is my lender page in kiva.
http://www.kiva.org/lender/dhasa