I spend my days on the solution-engineering side of connected-vehicle data projects: massive datasets, and always some new dataset arriving with a new car model (i.e., new technology). Lately, in my spare time, I have been applying ML/deep-learning techniques to datasets (many of them created from observations of real datasets). One half of this blog shares thoughts on that work; the other half is about my family and friends.
Tuesday, June 11, 2013
SOLR facet component customizations
Tuesday, May 28, 2013
SOLR 4.2 features: Rest API to fetch SOLR schema, fields etc in JSON
Until now, clients had to fetch the schema as XML and parse SOLR's response structures in their own logic.
If you are a pure JSON geek, that is a turn-off.
Starting with Solr 4.2, however, SOLR supports a
REST API to request the schema in JSON format.
And not only the entire schema file: one can request just a few fields, field types,
dynamic fields, copy fields, etc.
I wish it supported wildcards more broadly; maybe in a future release.
For now this is a solid beginning.
Entire schema:
http://localhost:8983/solr/collection1/schema?wt=json
Request the price field:
http://localhost:8983/solr/collection1/schema/fields/price?wt=json
Request dynamic fields ending with _i:
http://localhost:8983/solr/collection1/schema/dynamicfields/*_i?wt=json
Request the date field type:
http://localhost:8983/solr/collection1/schema/fieldtypes/date?wt=json
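The URLs above differ only in their path suffix, so they are easy to generate. A small sketch (the class and method names are mine, and collection1 is just the example core from the URLs above):

```java
// Hypothetical helper that builds the Solr 4.2 schema REST URLs shown above.
public class SchemaUrls {
    private final String base; // e.g. http://localhost:8983/solr/collection1

    public SchemaUrls(String base) { this.base = base; }

    public String entireSchema()           { return base + "/schema?wt=json"; }
    public String field(String name)       { return base + "/schema/fields/" + name + "?wt=json"; }
    public String dynamicField(String pat) { return base + "/schema/dynamicfields/" + pat + "?wt=json"; }
    public String fieldType(String type)   { return base + "/schema/fieldtypes/" + type + "?wt=json"; }

    public static void main(String[] args) {
        SchemaUrls u = new SchemaUrls("http://localhost:8983/solr/collection1");
        System.out.println(u.field("price"));
    }
}
```

Fetching any of these with a plain HTTP GET returns the JSON directly, with no SolrJ response parsing needed.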
Old style (via LukeRequest):
LukeRequest request = new LukeRequest();
request.setShowSchema(true);
request.setMethod(METHOD.GET);
LukeResponse response = request.process(getServer());
if (response != null && response.getFieldInfo() != null) {
    Map<String, LukeResponse.FieldInfo> fields = response.getFieldInfo();
    // ... walk the field map ...
}
Friday, May 24, 2013
Sample R code to count the number of terms in end-user queries & plot
dups <- function(input.csv) {
  # read the query log; strip.white trims stray spaces around each query
  df <- read.csv(input.csv, strip.white = TRUE)
  # cleanup: drop any "redirect" noise from the query term column
  df[[1]] <- gsub("redirect", "", df[[1]], fixed = TRUE)
  # filter duplicate queries
  ind <- duplicated(df)
  new.df <- df[!ind, ]
  # term count per query: count the spaces, add 1
  myh <- nchar(gsub("[^ ]", "", new.df[[1]])) + 1
  # buckets
  one   <- length(myh[myh == 1]); two   <- length(myh[myh == 2])
  three <- length(myh[myh == 3]); four  <- length(myh[myh == 4])
  five  <- length(myh[myh == 5]); six   <- length(myh[myh == 6])
  seven <- length(myh[myh == 7]); eight <- length(myh[myh == 8])
  result.frame <- data.frame(Number = 1:8,
                             Total  = c(one, two, three, four, five, six, seven, eight))
  plot(result.frame$Number, result.frame$Total, pch = 19, col = "blue",
       xlab = "Number of terms in a query", ylab = "Total")
  lines(result.frame$Number, result.frame$Total, lwd = 4, col = "red")
  lm1 <- lm(Total ~ Number, data = result.frame)
  abline(lm1, lwd = 4, col = "green")
}
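The same de-duplicate/count/bucket pipeline can be sketched in Java as well; the helper name and the whitespace-based term split are my assumptions, not from the original script:

```java
import java.util.*;

// Sketch mirroring the R script above: de-duplicate queries, count the terms
// in each (whitespace-separated), and bucket the counts 1..maxTerms.
public class TermBuckets {
    public static int[] bucketTermCounts(List<String> queries, int maxTerms) {
        Set<String> unique = new LinkedHashSet<>(queries); // drop duplicate queries
        int[] buckets = new int[maxTerms + 1];             // buckets[n] = #queries with n terms
        for (String q : unique) {
            String t = q.trim();
            int n = t.isEmpty() ? 0 : t.split("\\s+").length;
            if (n >= 1 && n <= maxTerms) buckets[n]++;
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> queries = Arrays.asList("solr join", "solr", "solr join", "facet range query");
        System.out.println(Arrays.toString(bucketTermCounts(queries, 8)));
    }
}
```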
Wednesday, May 15, 2013
Moving from FAST to SOLR: Project Updates
Current status: Phase I of the project is complete & the system is already in production. (With all its complexities, we wrapped up the project in record time.)
Positives: SOLR is fast in terms of content indexing & searches. (Overall QPS is greater than FAST, and the earlier sizing issues are resolved.)
Challenges:
1) During implementation we found heavy customizations around business rules. This functionality is not available in SOLR/Lucene out of the box, so I did some domain-specific customizations.
2) We are replacing a search product with decent relevancy (thanks to all those business rules) & we started late on relevancy. The relevancy ecosystem includes fine-tuning of similarity algorithms (tf-idf, BM25, etc.) plus fine-tuning of the synonyms/spell-check modules. SOLR's synonyms/spell-check modules need more improvements/core bug fixes. Again I did more customizations to meet the needs.
3) Dynamic range facets & site taxonomy traversal/updates need future work. The basic stuff is working; however, if the taxonomy changes often, doing incremental updates is a complex issue, and for now we have a workaround in place. To some extent, the business-rules work was invented to work around some of these problems. Map-reduce & graph-DB frameworks seem to solve the issues around dynamic range facets/dynamic taxonomies, so I am exploring simple integration approaches between Hadoop & SOLR.
Luck factor: The existing FAST-based search was not fully leveraging FAST's strong document-processing capabilities (linguistic normalization/sentiment/taxonomy, etc.), so we managed with small customizations around Lucene analyzers.
Tuesday, May 15, 2012
Lucene revolution conference 2012
Personally, I liked the following sessions because of their content and the presenters' energy/passion for search technology.
Automata Invasion
Challenges in Maintaining a High Performance Search Engine Written in Java
Updateable Fields in Lucene and other Codec Applications
Solr 4: The SolrCloud Architecture
The “Stump the Chump” questions were also interesting & I learned quite a bit.
I won small prizes too. Lucid Imagination generally uploads the
conference videos at the following location; keep watching.
I also missed a few good sessions in the Big Data area.
http://www.lucidimagination.com/devzone/videos-podcasts/conference-videos
Friday, May 04, 2012
Mapping hierarchical data in to SOLR
While modeling hierarchical data in XML is easy (for example, org charts or bill-of-materials structures), mapping it to persistent storage is very challenging. Relational SQL can manage it, but fetching/updating the hierarchies is very difficult. Entire books have been written on mapping trees/graphs into the RDBMS world.
Consider a simple hierarchy list that looks like this:
Satya
Saketh
Dhanvi
Venkata
Dhasa
The most common and familiar method is the adjacency model, where every node knows its adjacent (parent) node. (In the SOLR world, the ID field holds the unique value & the parent field holds the parent ID. Assume for the root node it is null or the same as the ID.)
In SOLR each row is a document:
SOLRID Name Parent
01 Satya NULL
02 Saketh 01
03 Dhanvi 01
04 Venkata 01
05 Dhasa 04
The recent SOLR (4.0 beta?) join functionality gives you the full hierarchy (or any piece of it) quickly and easily.
Example queries:
1) Give me the complete hierarchical list: q={!join from=id to=parent}id:*
2) Give me the immediate children of Satya: q={!join from=id to=parent}id:Satya
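The adjacency model itself is easy to sketch in plain Java; this hypothetical helper (names are mine) resolves immediate children from rows of (id, name, parent), mirroring the table above, which is essentially what the {!join} does server-side:

```java
import java.util.*;

// Sketch of the adjacency model: each row knows its parent; resolving
// immediate children is a single pass over the rows.
public class Adjacency {
    // rows are {id, name, parentId}; parentId is null for the root
    public static Map<String, List<String>> childrenByParent(String[][] rows) {
        Map<String, String> nameById = new HashMap<>();
        for (String[] r : rows) nameById.put(r[0], r[1]);
        Map<String, List<String>> children = new HashMap<>();
        for (String[] r : rows) {
            if (r[2] == null) continue; // root has no parent
            children.computeIfAbsent(nameById.get(r[2]), k -> new ArrayList<>()).add(r[1]);
        }
        return children;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"01", "Satya", null}, {"02", "Saketh", "01"},
            {"03", "Dhanvi", "01"}, {"04", "Venkata", "01"}, {"05", "Dhasa", "04"}
        };
        System.out.println(childrenByParent(rows).get("Satya")); // immediate children of Satya
    }
}
```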
An example SOLR query component to pull other objects
public class ExampleComponent extends SearchComponent
{
public static final String COMPONENT_NAME = "example";
@Override
public void prepare(ResponseBuilder rb) throws IOException
{
}
@SuppressWarnings("unchecked")
@Override
public void process(ResponseBuilder rb) throws IOException
{
DocSlice slice = (DocSlice) rb.rsp.getValues().get("response");
SolrIndexReader reader = rb.req.getSearcher().getReader();
SolrDocumentList rl = new SolrDocumentList();
int docId=0;//// at this point consider only one rootid.
for (DocIterator it = slice.iterator(); it.hasNext(); ) {
docId = it.nextDoc();
Document doc = reader.document(docId);
String id = (String)doc.get("id");
String connections = (String)doc.get("contains");
System.out.println("\n id:"+id+" contains-->"+connections);
List<String> list = new ArrayList<String>();
list.add(id);//add rootid too. If we have joins in solr4.0
int pos = 0, end;
while ((end = connections.indexOf(',', pos)) >= 0) {
list.add(connections.substring(pos, end));
pos = end + 1;
}
BooleanQuery bq = new BooleanQuery();
Iterator<String> cIter = list.iterator();
while (cIter.hasNext()) {
String anExp = cIter.next();
TermQuery tq = new TermQuery(new Term("id",anExp));
bq.add(tq, BooleanClause.Occur.SHOULD);
}
SolrIndexSearcher searcher = rb.req.getSearcher();
DocListAndSet results = new DocListAndSet();
results.docList = searcher.getDocList(bq, null, null,0, 100,rb.getFieldFlags());
System.out.println("\n results.docList-->"+results.docList.size() );
rl.setNumFound(results.docList.size());
rb.rsp.getValues().remove("response");
rb.rsp.add("response", results.docList);
}
}
@Override
public String getDescription() {
return "Information";
}
@Override
public String getVersion() {
return "Solr gur";
}
@Override
public String getSourceId() {
return "Satya Solr Example";
}
@Override
public String getSource() {
return "$URL: ";
}
@Override
public URL[] getDocs() {
return null;
}
}
Wednesday, March 28, 2012
Date format conversion in Java
public static final String[] date_format_list = {
"yyyy-MM-dd'T'HH:mm:ss'Z'",
"yyyy-MM-dd'T'HH:mm:ss",
"yyyy-MM-dd",
"yyyy-MM-dd hh:mm:ss",
"yyyy-MM-dd HH:mm:ss",
"EEE MMM d hh:mm:ss z yyyy"
/// add your own format here
};
public static Date parseDate(String d) throws ParseException {
return parseInputWithFormats(d, date_format_list);
}
public static Date parseInputWithFormats(
String dateValue,
String[] formatList
) throws ParseException {
if (dateValue == null || formatList == null || formatList.length == 0) {
throw new IllegalArgumentException("dateValue or formatList is null/empty");
}
if (dateValue.length() > 1
&& dateValue.startsWith("'")
&& dateValue.endsWith("'")
) {
dateValue = dateValue.substring(1, dateValue.length() - 1);
}
SimpleDateFormat dateParser = null;
for(int i=0;i < formatList.length;i++){
String format = (String) formatList[i];
if (dateParser == null) {
dateParser = new SimpleDateFormat(format, Locale.US);
} else {
dateParser.applyPattern(format);
}
try {
return dateParser.parse(dateValue);
} catch (ParseException pe) {
//pe.printStackTrace();
}
}
throw new ParseException("Unable to parse the input date " + dateValue, 0);
}
public static void main(String[] args) {
String fromDt="";
String nPattern = "yyyy-MM-dd'T'HH:mm:ss'Z'";
SimpleDateFormat sdf = new SimpleDateFormat(nPattern);
String currentValue="Fri Jul 22 04:22:14 CEST 2011";
try{
fromDt = sdf.format(parseDate(currentValue.toString() ) );
} catch (Exception e) {
System.out.print("\n Case1: date format exception"+e.getMessage()+ " SOLR currentValue:"+currentValue);
fromDt="";
}
System.out.println("Case1. date as str---"+fromDt);
currentValue="2011-07-21 21:22:14";
try{
fromDt = sdf.format(parseDate(currentValue.toString() ) );
} catch (Exception e) {
System.out.print("\n Case2: date format exception"+e.getMessage()+ " SOLR currentValue:"+currentValue);
fromDt="";
}
System.out.println("\n Case2. date as str---"+fromDt);
}
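On newer JDKs, the same try-each-format idea can be sketched with java.time, a thread-safe alternative to SimpleDateFormat (the class name and the trimmed pattern list here are mine):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Sketch: try each formatter in order and return the first successful parse.
public class MultiFormatParse {
    private static final DateTimeFormatter[] FORMATS = {
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.US),
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss", Locale.US),
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss", Locale.US)
    };

    public static LocalDateTime parse(String input) {
        for (DateTimeFormatter f : FORMATS) {
            try { return LocalDateTime.parse(input, f); }
            catch (DateTimeParseException ignored) { /* try the next pattern */ }
        }
        throw new IllegalArgumentException("Unable to parse: " + input);
    }

    public static void main(String[] args) {
        System.out.println(parse("2011-07-21 21:22:14"));
    }
}
```

Unlike SimpleDateFormat, the formatters are immutable, so one static array can be shared across threads.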
Monday, March 26, 2012
Latest family picture from Disney World.
Friday, March 23, 2012
Parsing a complex XSD file with Java code
This code exists mainly as a sanity check, but it works fine.
(You need xsom.jar + the relaxngDatatype.jar file. Google them; you will find the jars.)
public class XsdReader {
public static void main (String args[])
{
XsdReader rr = new XsdReader();
rr.parseSchema();
}
public void parseSchema()
{
File file = new File("D:\\tmp\\books.xsd");
try {
XSOMParser parser = new XSOMParser();
parser.parse(file);
XSSchemaSet sset = parser.getResult();
XSSchema mys = sset.getSchema(1);
Iterator itr = sset.iterateSchema();
while( itr.hasNext() ) {
XSSchema s = (XSSchema)itr.next();
System.out.println("Target namespace: "+s.getTargetNamespace());
XSComplexType ct = mys.getComplexType("books");
int ctr=0;
if ( ct != null){
Collection c = ct.getAttributeUses();
Iterator i = c.iterator();
while(i.hasNext()){
XSAttributeDecl attributeDecl = i.next().getDecl();
System.out.print("ctr="+ctr++ +"name:"+ attributeDecl.getName());
System.out.print(" type: "+attributeDecl.getType());
System.out.println("");
}
}
Iterator jtr = s.iterateElementDecls();
while( jtr.hasNext() ) {
XSElementDecl e = (XSElementDecl)jtr.next();
System.out.print( e.getName() );
if( e.isAbstract() )
System.out.print(" (abstract)");
System.out.println();
}
}
}
catch (Exception exp) {
exp.printStackTrace(System.out);
}
}
}
Monday, February 06, 2012
Lucene Standard Analyzer vs. Lingpipe EnglishStop Tokenizer Analyzer
For some odd reason, I ended up prototyping different analyzers for PLM-space content against 3rd-party analyzers. (The basic question is which gives better control over STOP words; at least based on my quick prototype, SOLR has the easier constructs.)
A small sample comparing both analyzers is included.
I did not see much difference for small input text.
public class AnalyzerTest {
private static Analyzer analyzer;
private static long perfTime = 0;
public static void main(String[] args) {
try {
analyzer = new StandardAnalyzer(org.apache.lucene.util.Version.LUCENE_34);
String str = "PLM technology refers to the group of software applications that create and manage the data that define a product and the process for building the product. Beyond just technology, PLM is a discipline that defines best practices for product definition, configuration management, change control, design release, and many other product and process-related procedures.";
long luceneTime = -System.currentTimeMillis();
displayTokensWithLuceneAnalyzer(analyzer, str);
luceneTime += System.currentTimeMillis();
System.out.println("Lucene Analyzer: " + luceneTime + " msecs.");
long lingpipeTime = -System.currentTimeMillis();
displayTokensWithLingpipeAnalyzer(str);
lingpipeTime += System.currentTimeMillis();
System.out.println("Lingpipe Analyzer: " + lingpipeTime + " msecs.");
perfTime = luceneTime + lingpipeTime; // total for the final print below
} catch (IOException ie) {
System.out.println("IO Error " + ie.getMessage());
}
System.out.println("Time: " + perfTime + " msecs.");
System.out.println("Ended");
}
private static void displayTokensWithLingpipeAnalyzer(String text)
throws IOException {
System.out.println("Inside LingpipeAnalyzer ");
TokenizerFactory ieFactory
= IndoEuropeanTokenizerFactory.INSTANCE;
TokenizerFactory factory
= new EnglishStopTokenizerFactory(ieFactory);
// = new IndoEuropeanTokenizerFactory();
char[] cs =text.toCharArray();
Tokenizer tokenizer = factory.tokenizer(cs, 0, cs.length);
String[] tokens = tokenizer.tokenize();
for (int i = 0; i < tokens.length; i++)
System.out.println(tokens[i]);
System.out.println("Total no. of Tokens: " +tokens.length );
}
private static void displayTokensWithLuceneAnalyzer(Analyzer analyzer, String text)
throws IOException {
System.out.println("Inside LuceneAnalyzer ");
TokenStream tokenStream = analyzer.tokenStream("contents",new StringReader(text) );
OffsetAttribute offsetAttribute = tokenStream.getAttribute(OffsetAttribute.class);
CharTermAttribute charTermAttribute = tokenStream.getAttribute(CharTermAttribute.class);
int length=0;
while (tokenStream.incrementToken()) {
int startOffset = offsetAttribute.startOffset();
int endOffset = offsetAttribute.endOffset();
String term = charTermAttribute.toString();
System.out.println("term->"+term+ " start:"+startOffset+" end:"+endOffset);
length++;
}
System.out.println("Total no. of Tokens: " + length);
}
}
Tuesday, January 31, 2012
Java Set Operations
This post is not about federated search, but I keep using a bunch of set operations to compare search results, compare the unique doc IDs, etc.
public class SetOperations {
    public static <T> Set<T> union(Set<T> setA, Set<T> setB) {
        Set<T> tmp = new TreeSet<T>(setA);
        tmp.addAll(setB);
        return tmp;
    }
    public static <T> Set<T> intersection(Set<T> setA, Set<T> setB) {
        Set<T> tmp = new TreeSet<T>();
        for (T x : setA)
            if (setB.contains(x))
                tmp.add(x);
        return tmp;
    }
    public static <T> Set<T> difference(Set<T> setA, Set<T> setB) {
        Set<T> tmp = new TreeSet<T>(setA);
        tmp.removeAll(setB);
        return tmp;
    }
    public static void main(String[] args) {
        SortedSet<String> s1 = new TreeSet<String>();
        s1.add("one");
        s1.add("two");
        SortedSet<String> s2 = new TreeSet<String>();
        s2.add("two");
        s2.add("three");
        s2.add("four");
        SortedSet<String> result = (SortedSet<String>) union(s1, s2);
        Iterator<String> it = result.iterator();
        System.out.print("union result -->");
        while (it.hasNext()) {
            String value = it.next();
            System.out.print(value + ", ");
        }
        System.out.println("\n");
        result = (SortedSet<String>) intersection(s1, s2);
        it = result.iterator();
        System.out.print("intersection result-->");
        while (it.hasNext()) {
            String value = it.next();
            System.out.print(value + ", ");
        }
        System.out.println("\n");
        result = (SortedSet<String>) difference(s1, s2);
        it = result.iterator();
        System.out.print("difference result-->");
        while (it.hasNext()) {
            String value = it.next();
            System.out.print(value + ", ");
        }
        System.out.println("\n");
        /*
        SortedSet<Integer> i1 = new TreeSet<Integer>();
        i1.add(Integer.valueOf(1));
        SortedSet<Integer> i2 = new TreeSet<Integer>();
        i2.add(Integer.valueOf(2));
        SortedSet<Integer> iresult = (SortedSet<Integer>) union(i1, i2);
        Iterator<Integer> iit = iresult.iterator();
        System.out.println("Integer union result");
        while (iit.hasNext()) {
            Integer value = iit.next();
            System.out.println(value + ",");
        }
        */
    }
}
Monday, October 24, 2011
XSL recursion sample
Here is the sample code.
<xsl:template name="printChildObjects">
<xsl:param name="inputString"/>
<xsl:param name="delimiter"/>
<xsl:choose>
<xsl:when test="contains($inputString, $delimiter)">
<xsl:variable name="aChild">
<xsl:value-of select="substring-before($inputString,$delimiter)"/>
</xsl:variable>
<xsl:element name="field">
<xsl:attribute name="name">
<xsl:text> childObject</xsl:text>
</xsl:attribute>
<xsl:value-of select="$aChild" />
</xsl:element>
<xsl:call-template name="printChildObjects">
<xsl:with-param name="inputString" select="substring-after($inputString,$delimiter)"/>
<xsl:with-param name="delimiter"
select="$delimiter"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:choose>
<xsl:when test="$inputString != ''">
<xsl:element name="field">
<xsl:attribute name="name">
<xsl:text> childObject</xsl:text>
</xsl:attribute>
<xsl:value-of select="$inputString" /> </xsl:element>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Thursday, July 28, 2011
xsl:text must not contain child elements
Often there is a requirement to print a variable's value in the output.
The following XSL code displays compilation errors because xsl:text must not contain child elements.
A simple workaround is to pass the variable to a template:
<xsl:text>
<xsl:value-of select="$objType"/> </xsl:text>
<!-- <xsl:call-template name="varValue"> <xsl:with-param name="value" select="$objType" /> </xsl:call-template> -->
<xsl:template name="varValue">
<xsl:param name="value" />
<xsl:choose>
  <xsl:when test="$value = ''">
    <xsl:text>n/a</xsl:text>
  </xsl:when>
  <xsl:otherwise>
    <xsl:value-of select="$value" />
  </xsl:otherwise>
</xsl:choose>
</xsl:template>
Tuesday, May 24, 2011
Typical industry search input requirements/ patterns
Now I want to filter the list with different kinds of input patterns (aka single-field search), in particular accepting wildcards (*, ? & +; escaping these is fun in Java).
sample code:
String[] inputs = {
    "*+PATAC+*",
    "adjkfh+PATAC+ajdskfhhk",
    "PATAC",
    "adjkfh+PATAC+",
    "+PATAC+testingsuffixchars"
};
Pattern pat = Pattern.compile(".*\\+*\\+.*");
for (int i = 0; i < inputs.length; i++) {
    String str = inputs[i];
    Matcher matcher = pat.matcher(str);
    boolean flag = matcher.find();
    Logger.println((i + 1) + ") matcher result->" + flag);
    if (flag)
        Logger.println("pattern found " + str);
}
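When the user's input should be matched literally rather than interpreted as a pattern, Pattern.quote saves hand-escaping of *, ? and +. A small sketch (the class and method names are mine):

```java
import java.util.regex.Pattern;

// Sketch: quote the user term so +, *, ? are treated as plain characters,
// not regex operators, then test for a literal substring match.
public class WildcardFilter {
    public static boolean containsLiteral(String text, String userTerm) {
        Pattern p = Pattern.compile(".*" + Pattern.quote(userTerm) + ".*");
        return p.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsLiteral("adjkfh+PATAC+ajdskfhhk", "+PATAC+")); // true
        System.out.println(containsLiteral("PATAC", "+PATAC+"));                  // false
    }
}
```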
Sample code to create SOLR document from CSV file
public class CsvToSolrDoc
{
public String columnName(int i)
{
//workarounds, workarounds
if (i == 0) return "id";
if (i == 1) return "whatever you want as the field name";
return null;
}
public void csvToXML(String inputFile, String outputFile) throws java.io.FileNotFoundException, java.io.IOException
{
    BufferedReader br = new BufferedReader(new FileReader(inputFile));
    String line = null;
    FileWriter fw = new FileWriter(outputFile);
    // Write the root element of the Solr add document
    fw.write("<add>\n");
    while ((line = br.readLine()) != null)
    {
        String[] values = line.split(",");
        fw.write("<doc>\n");
        for (int j = 0; j < values.length; j++)
        {
            fw.write("<field name=\"" + columnName(j) + "\">");
            fw.write(values[j].trim());
            fw.write("</field>\n");
        }
        fw.write("</doc>\n");
    }
    // Now we're at the end of the file, so close the XML document,
    // flush the buffer to disk, and close the newly-created file.
    fw.write("</add>\n");
    fw.flush();
    fw.close();
    br.close();
}
public static void main(String argv[]) throws java.io.IOException
{
CsvToSolrDoc cp = new CsvToSolrDoc();
cp.csvToXML("c:\\tmp\\m2.csv", "c:\\tmp\\m2.xml");
}
}
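One caveat with writing CSV values straight into XML: characters like & and < in a field value will produce an invalid document. A hedged helper to escape them before each write (class/method names are mine):

```java
// Sketch: escape the five XML special characters so arbitrary CSV values
// can be written safely inside <field> elements.
public class XmlEscape {
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            switch (c) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Bolt & Nut <M5>"));
    }
}
```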
SOLR project stories: lack of SOLR post-filter support
Without post-filter support, the component iterates over the entire document set, which hurts for large document sets. We also need SOLR's distributed capabilities. While computing the ACL fields, I tried to encode user names etc. with Base64, URLEncoder.encode, and so on. For a small set of strings this works OK, but for large sets it is a pain, and it ultimately affects search performance.
Another blocker.
Encoder/decoder test code:
long startTime, endTime, elapsedTime;
String encodedText, decodedText;
startTime = System.currentTimeMillis();
String inputText = "Hello#$#%^#^&world";
for (int i =0;i<50000;i++)
{
String baseString = i+ " "+inputText;
encodedText = URLEncoder.encode(baseString,"UTF-8");
decodedText = URLDecoder.decode(encodedText, "UTF-8");
}
endTime = System.currentTimeMillis();
elapsedTime = endTime - startTime;
System.out.println( "\n URLEncoder/decoder Elapsed Time = " + elapsedTime + "ms");
>>>>
Elapsed Time = 2246ms
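For the Base64 route mentioned above, newer JDKs (8+) ship java.util.Base64. A minimal round-trip sketch, assuming URL-safe tokens are wanted (the class name is mine):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of a Base64 round trip for ACL user names, using the URL-safe
// alphabet so tokens can also appear in query strings.
public class AclEncode {
    public static String encode(String user) {
        return Base64.getUrlEncoder().withoutPadding()
                     .encodeToString(user.getBytes(StandardCharsets.UTF_8));
    }

    public static String decode(String token) {
        return new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String token = encode("Hello#$#%^#^&world");
        System.out.println(token + " -> " + decode(token));
    }
}
```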
Monday, November 08, 2010
Few Inspiring Quotes from The Great Ones The Transformative Power of a Mentor
Adopt the pace of nature: her secret is patience. – Emerson
Watch your thoughts; they become words
Watch your words; they become actions
Watch your actions; they become habits
Watch your habits; they become character
Watch your character; it becomes your destiny. – Unknown
I hope I shall always possess firmness and virtue enough to maintain what I consider the most enviable of all titles, the character of an honest man. – George Washington
The person who makes a success of living is the one who sees his goal steadily and aims for it unswervingly. That is dedication. – DeMille
What we say and what we do
Ultimately comes back to us
So let us own our responsibility
Place it in our hands and carry it with dignity and strength – Anzaldua
Friday, November 05, 2010
Simple SOLR example code to index & search metadata
protected URL solrUrl;
public CommonsHttpSolrServer solr = null;
public SolrServer startSolrServer() throws MalformedURLException,
SolrServerException
{
solr = new CommonsHttpSolrServer("http://localhost:8983/solr/");
solr.setParser(new XMLResponseParser());
return solr;
}
public TestQueryTool()
{
//will add more junk later
}
public static void main(String[] args)
{
TestQueryTool t = new TestQueryTool();
try
{
//1)start & work with existing SOLR instance
t.startSolrServer();
// 2)Now index content ( files later. metadata)
t.indexBOAsDocument("uid1", "0001", "bolt");
t.indexBOAsDocument("uid2", "0002", "nut");
//3)now perform search
t.performSearch("uid");
} catch (MalformedURLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (SolrServerException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private void indexBOAsDocument(String uid, String id, String name)throws SolrServerException
{
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", uid);
doc.addField("part_name", name);
doc.addField("part_id", id);
try {
solr.add(doc);
solr.commit();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
public void performSearch(String qstring) throws SolrServerException
{
SolrQuery query = new SolrQuery();
//query.setQuery( "*:*" );
query.setQuery("id:uid* OR part_name:nut");
query.addSortField( "part_name", SolrQuery.ORDER.desc);
QueryResponse response = solr.query(query);
SolrDocumentList docs = response.getResults();
info("docs size is ="+docs.size());
Iterator<SolrDocument> iter = docs.iterator();
while (iter.hasNext()) {
SolrDocument resultDoc = iter.next();
String part_name = (String) resultDoc.getFieldValue("part_name");
String id = (String) resultDoc.getFieldValue("id"); //id is the uniqueKey field
// Now you got id, part_name. End of story
}
}
Monday, October 04, 2010
Nice little book "The Great Ones" by Ridgely Goldsborough
Following is its simple code of conduct:
1. Make a decision & commitment
2. Conceive and execute a plan
3. Take full responsibility
4. Embrace patience and temperance
5. Act with courage
6. Cultivate passion
7. Exercise discipline
8. Remain single-minded
9. Demand integrity
10. Let go of past failures (mostly, learn from them)
11. Pay the price
Thursday, January 14, 2010
Interesting free book “Best Kept Secrets of Peer Code Review”
Order your FREE book with FREE shipping.
You can also view sample chapters; some of the points are interesting.
If you do code reviews or are a senior developer, take some time to read it.
I ordered one & after receiving the book, I will update this blog.
Wednesday, December 30, 2009
Dhanvi & Saketh still from 2009 India trip
Tuesday, December 29, 2009
Long list of books waiting to be read - 1
Nearly a decade ago, I read this book.
Now I picked up the latest 2010 (Hard Times) edition.
During my Boston-to-Los Angeles transition, I looked to this book for some guidance.
As in the older editions, the book sticks to its core ideas, but the presentation & reference material have changed a lot. I was genuinely surprised by information like the hundreds of job-listing sites, starting a home business, etc. To some extent I have not been following the latest trends: I have a job that I love & do with tons of commitment, and for that reason I never entertained any career change in the last ten years, so I am not aware of the changing trends in the job market. Whether you are looking for a job or not, it is still worth reading.
The topic "Things schools never taught us about job hunting" is always my favorite.
An interesting quote from the book
He or she who gets hired is not necessarily the one who can do the job best, but the one who knows the most about how to get hired. – Richard Lathrop
Tuesday, April 21, 2009
Interesting little book “ReWealth”
Published by McGraw-Hill
Last weekend I picked it up from the library. It contains very good information about what it takes to attain real sustainability.
I am not done yet, but it has very interesting tales & is full of quality quotes.
If I find more time, I will post a more complete review here.
Give a man a fish, and you’ll feed him for a day.
Teach a man to fish, and he’ll buy a funny hat.
Talk to a hungry man about fish, and you’re a consultant.
—Scott Adams
The nation behaves well if it treats the natural resources as assets which it must turn over to the next generation INCREASED . . . in value. (emphasis added)
—Theodore Roosevelt
We really do not know how [the economy] works. . .
The old models just are not working.
—Alan Greenspan, former chairman of the U.S. Federal Reserve
If we become rich or poor as a nation, it’s because of water.
—Sunita Narain
A brief Kiva two-year update.
Monday, July 21, 2008
Birthday video
Thursday, July 03, 2008
Eclipse Ganymede C/C++ CDT IDE platform & my first impressions
1. Make sure you have a stable JDK on your machine (in my case, JDK 1.5).
2. Download MinGW from http://www.mingw.org/
& install it (make sure to enable the g++ installation option) &
update your system PATH setting.
After step 2, if you type g++ -v in your Windows command prompt,
it will display the following:
>>>
C:\a_eclipse_c_c++\eclipse-cpp-ganymede-win32\eclipse>g++ -v
Reading specs from C:/MinGW/bin/../lib/gcc/mingw32/3.4.5/specs etc. etc.
>>
3. Download eclipse & CDT environment from the following location.
http://www.eclipse.org/cdt/index.php
4. After installing eclipse, start eclipse.
5. Start with File->New->Project->C++ and give a default name, etc.
6. At this stage you are ready to go with a new class, or import some existing files.
7. If you are an experienced Eclipse IDE user (i.e., for Java code), building & running the code is very easy; otherwise you have to play with Run->New Configuration.
Unfortunately, the Eclipse Help content sucks.
I finished steps (2) to (7) in less than 30 minutes. I wrote a few C & C++ programs & everything is looking good. More later.
Friday, February 15, 2008
Integrate Full Text search functionality in to your applications with Lucene (searching)
public List search() {
List searchResult = new ArrayList();
IndexSearcher indexSearcher = null;
// a) Get the indexer.
// b) Build the default query parser
//    (since this was a demo, accept all wildcard characters in the inputs).
// c) Now search the indexes (we already built them in the earlier step):
Hits hits = indexer.search(query);
My conclusions:
Lucene is a pure Java product that provides:
* ranked searching; best results returned first (I need to test more here)
* a good number of query types:
phrase queries, wildcard queries, proximity queries, range queries, etc.
* fielded searching (e.g., title, path, contents)
* date-range searching
* sorting by any field
* multiple-index searching (I am working on this one right now)
* simultaneous update and searching
I am looking forward to C/C++ implementation.
Wednesday, January 23, 2008
Integrate Full Text search functionality in to your applications with Lucene (indexing)
Any full-text search functionality involves indexing the data first, and Lucene is no different. By indexing your data, it can perform high-performance full-text searches very fast: I indexed 17,000 HTML files (my product documentation) in less than 5 minutes.
The methods that create the index writer & add documents are the key;
the rest of the methods are for bookkeeping.
The following code indexes .html and .htm files in a folder. (It recursively iterates the nested folders & indexes each file.)
////you data
public static final String dataDir = "D:\\webapps\\help";
//the directory that is used to store lucene index
private final String indexDir = "D:\\help_index";
public static String src1 = "";
public IndexWriter indexWriter;
public static int numF;
public static int numD;
public void openIndexWriter()throws IOException
{
Directory fsDirectory = FSDirectory.getDirectory(indexDir);
Analyzer analyzer = new StandardAnalyzer();
indexWriter = new IndexWriter(fsDirectory, true, analyzer);
indexWriter.setWriteLockTimeout(IndexWriter.WRITE_LOCK_TIMEOUT * 100 );
}
public void closeIndexWriter()throws IOException
{
indexWriter.optimize();
indexWriter.close();
}
public void indexFiles(String strPath) throws IOException
{
File src = new File(strPath);
if (src.isDirectory())
{
numD++;
String list[] = src.list();
try
{
for (int i = 0; i < list.length; i++)
{
src1 = src.getAbsolutePath() + File.separatorChar + list[i];
File file = new File(src1);
/*
* Try check like read/write access check etc.
*/
if ( file.isDirectory() )indexFiles(src1);
else
{
numF++;
if(src1.endsWith(".html") || src1.endsWith(".htm")){
addDocument(src1, indexWriter);
}
}
}
}catch(java.lang.NullPointerException e){}
}
}
public boolean createIndex() throws IOException{
if(true == ifIndexExist()){
return true;
}
File dir = new File(dataDir);
if(!dir.exists()){
return false;
}
File[] htmls = dir.listFiles();
Directory fsDirectory = FSDirectory.getDirectory(indexDir);
Analyzer analyzer = new StandardAnalyzer();
IndexWriter indexWriter = new IndexWriter(fsDirectory, analyzer, true);
for(int i = 0; i < htmls.length; i++){
String htmlPath = htmls[i].getAbsolutePath();
if(htmlPath.endsWith(".html") || htmlPath.endsWith(".htm")){
addDocument(htmlPath, indexWriter);
}
}
return true;
}
/**
* Add one document to the lucene index
*/
public void addDocument(String htmlPath, IndexWriter indexWriter){
//System.out.println("\n adding file to index "+htmlPath );
HTMLDocParser htmlParser = new HTMLDocParser(htmlPath);
String path = htmlParser.getPath();
String title = htmlParser.getTitle();
Reader content = htmlParser.getContent();
Document document = new Document();
document.add(new Field("path",path,Field.Store.YES,Field.Index.NO));
document.add(new Field("title",title,Field.Store.YES,Field.Index.TOKENIZED));
document.add(new Field("content",content));
try {
indexWriter.addDocument(document);
} catch (IOException e) {
e.printStackTrace();
}
}
im.openIndexWriter();
File src = new File(dataDir);
if(!src.exists()){
System.out.println("\n DATA DIR DOES NOT EXIST" );
return;
}
long start = System.currentTimeMillis();
System.out.println("\n INDEXING STARTED" );
im.indexFiles(dataDir);
im.closeIndexWriter();
long end = System.currentTimeMillis();
long diff = (end-start)/1000;
System.out.println("\n Time consumed to index the whole help (seconds) = " +diff );
System.out.println("Number of files :\t"+numF);
System.out.println("Number of dirs :\t"+numD);
}
Friday, January 11, 2008
Populating a MySQL table from an MS Excel {aka .csv} file
{I felt the Apache, PHP & MySQL combination fits their need.
I will explain that application a little later.}
I had already received some data in an MS Excel file.
Strangely, some trailing columns are missing in some records after saving the Excel file as a csv file.
The following command fails with "column truncated" MySQL errors.
mysql> load data infile 'C://bea//temple//dpexport.csv' into table donar
_info_4 fields terminated by ',' OPTIONALLY ENCLOSED BY '"' Lines terminated by
'\n';
After spending a little more time with the MySQL documentation,
I found out that the IGNORE option does the magic &
I am able to load the csv files into my MySQL table. Just add IGNORE right after the infile's file name:
mysql> load data infile 'C://bea//temple//dpexport.csv' IGNORE into table donar
_info_4 fields terminated by ',' OPTIONALLY ENCLOSED BY '"' Lines terminated by
'\n';
Wednesday, January 02, 2008
Is the Amazon Kindle the next iPod?
Suddenly a young man sat next to me, seriously browsing the web on his Amazon Kindle. (What made me curious was that he was reading one of my favorite web sites, the New York Times.) As I paid attention to the gadget, he slowly started talking about all the good things about his new gadget & offered to let me feel it. I quickly checked the weight & the look and feel of a blog & an e-book. Not heavy, & the look and feel was very good & natural. I liked it a lot. I was so tempted that after coming home, I checked the Amazon website for the Kindle.
(A little pricey; it was sold out & there seems to be a lot of demand for the gadget.) Most of the reviews are positive. This incident reminds me of another old incident, my first experience with the iPod. Nearly 4+ years back, while coming back from my vacation (India -> London -> US), the person sitting next to me was explaining & talking & proud of owning a new iPod. (If I remember correctly, it was the second month after the iPod launch.) Now I am seeing the same thing happening with Amazon's Kindle. I liked the way it was designed (automatic download of content to the device) & the convenience & thought process behind the Kindle. I was surprised that Amazon came up with this kind of device. {After its initial launch as a major online book store, this is the best thing from Amazon. My personal opinion.} Hoping that I will own an Amazon Kindle very soon. It fits my taste.
Wednesday, December 19, 2007
An update & my progress in Kiva
Following is my kiva lender page http://www.kiva.org/lender/dhasa
Friday, November 16, 2007
Java DefaultListModel performance issues
Recently I received a big customer escalation in the search domain.
Basically, the end user was searching for enterprise information based on input criteria.
In our Java rich client, we show a simple dialog to select the list of users in the enterprise. This action was consuming 10+ minutes.
The UI works fine for 100 to 1000 users,
but the customer was testing with 10K plus users.
After analyzing all the code on the server side, I finally looked at the client layer.
At the client, server data is added to a DefaultListModel with addElement() in a for loop.
The real culprit is the addElement() method.
After looking at the implementation of that method & its sequence of event calls, &
a little browsing of the Java forums, I found out that you should never use DefaultListModel directly for large lists. This problem still exists in the JDK 1.5 version. There are multiple solutions to it; just Google it & you will find many.
I made a quick fix based on advice from a Sun forum. (It is fast & I am seeing a 90% improvement.)
Steps:
1) Extend your DefaultListModel as shown below
class FastListModel extends DefaultListModel {
private boolean listenersEnabled = true;
public boolean getListenersEnabled() { return listenersEnabled; }
public void setListenersEnabled(boolean enabled) { listenersEnabled = enabled; }
public void fireIntervalAdded(Object source, int index0, int index1) {
if (getListenersEnabled()) {
super.fireIntervalAdded(source, index0, index1);
}
}
}
2) Add list listener to your list model
ListDataListener listener = new ListDataListener() {
public void intervalAdded(ListDataEvent e) { list.ensureIndexIsVisible(e.getIndex1()); }
public void intervalRemoved(ListDataEvent e) { }
public void contentsChanged(ListDataEvent e) { }
};
model.addListDataListener(listener);
3) Turn on & off listener explicitly
model.setListenersEnabled(false);
// bulk-add the content (the original loop bound was lost; items is illustrative)
for (int i = 0; i < items.length; i++) model.addElement(items[i]);
// now re-enable the listeners
model.setListenersEnabled(true);
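Putting the three steps together, here is a minimal self-contained sketch of the batched-add pattern. The class names, the 10K element count, and the single fireIntervalAdded call at the end are my own additions, not from the original fix, and the generic DefaultListModel<String> assumes a newer JDK than the 1.5 mentioned above:

```java
import javax.swing.DefaultListModel;

// Sketch: suppress per-element ListDataEvents during a bulk load,
// then fire one interval event at the end.
public class FastListDemo {
    // Same idea as step 1 above: a DefaultListModel whose event firing
    // can be switched off while thousands of elements are added.
    static class FastListModel extends DefaultListModel<String> {
        private boolean listenersEnabled = true;
        public void setListenersEnabled(boolean enabled) { listenersEnabled = enabled; }
        @Override
        public void fireIntervalAdded(Object source, int index0, int index1) {
            if (listenersEnabled) super.fireIntervalAdded(source, index0, index1);
        }
    }

    public static void main(String[] args) {
        FastListModel model = new FastListModel();
        model.setListenersEnabled(false);           // listeners off
        for (int i = 0; i < 10000; i++) {           // bulk add without 10K events
            model.addElement("user-" + i);
        }
        model.setListenersEnabled(true);            // listeners back on
        // one event so an attached JList would repaint exactly once
        model.fireIntervalAdded(model, 0, model.getSize() - 1);
        System.out.println(model.getSize());        // prints 10000
    }
}
```

The point of the final fireIntervalAdded is that any attached JList gets exactly one notification for the whole batch instead of one per element.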
Wednesday, October 03, 2007
New Java Runtime methods
Use Java Runtime methods like maxMemory(), freeMemory(), totalMemory() etc. to know your application's or your module's memory usage. It always helps.
Also use availableProcessors() sensibly if your application is spawning too many threads.
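As a small illustration (the MB conversion and label strings are mine), the sketch below prints the heap figures and the processor count:

```java
// Reads the JVM's memory figures and CPU count via java.lang.Runtime.
public class RuntimeInfo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        long used = rt.totalMemory() - rt.freeMemory();   // heap currently in use
        System.out.println("max heap   (MB): " + rt.maxMemory() / mb);
        System.out.println("total heap (MB): " + rt.totalMemory() / mb);
        System.out.println("free heap  (MB): " + rt.freeMemory() / mb);
        System.out.println("used heap  (MB): " + used / mb);
        // handy for sizing thread pools instead of hard-coding a thread count
        System.out.println("processors     : " + rt.availableProcessors());
    }
}
```

A thread pool sized with availableProcessors() adapts to the machine instead of spawning a fixed number of threads.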
Thursday, September 06, 2007
Hooray, XSLT is now part of JDK 1.5
(Thank God, we don't have to download xalan, xerces etc.
& set big class paths to author style sheets.)
The minus point: the Java implementation of Xalan works fine
for simple transformation use cases
but fails badly for the complex cases.
(I will write up those test cases in the next post.)
The following is a sample Test.java to transform the input xml into HTML (or any other output) using the Java Xalan libs.
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import java.io.FileInputStream;
import java.io.FileOutputStream;
public class Test
{
public static void main(String[] args) throws Exception
{
Source source = new StreamSource(new FileInputStream("C://AE_html.xsl"));
Transformer t = TransformerFactory.newInstance().newTransformer(source);
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new FileInputStream("c://users.xml"));
System.out.println("Transforming...");
t.transform(new DOMSource(doc),new StreamResult ( new FileOutputStream("C://users.html")) ); }
}
//System.out.println( "doc as text content"+ doc.getTextContent() );
//t.transform(new DOMSource(doc), new StreamResult(System.out));
Thursday, August 30, 2007
Tuesday, August 28, 2007
Monday, August 27, 2007
Monday, June 04, 2007
More online references
Scholarpedia
www.scholarpedia.org
Conservapedia
www.conservapedia.org
Citizendium
www.citizendium.org
Wednesday, April 11, 2007
Became a member in kiva
But the following program was really inspiring. (It is a small part of Frontline.)
(A Little Goes a Long Way)
Watch it online first.
http://www.pbs.org/frontlineworld/stories/uganda601/
I immediately decided to become a member of kiva.
I believe in the concept (microfinance & direct lending to the needy).
I have watched many programs on microfinance.
But I never knew how to be part of it, or whether an individual can join.
www.kiva.org helped me to contribute.
At present I am thinking of contributing for another year.
Started with 6 & my goal is to help 50 at any time.
Please read the FAQ etc. on www.kiva.org before jumping in.
“It's a new, direct and sustainable way to fight global poverty, and the way I see it, I get a higher return on $25 helping someone build a future than the interest my checking account pays. “
Following is my lender page in kiva.
http://www.kiva.org/lender/dhasa
Wednesday, November 29, 2006
Sunday, May 21, 2006
Tuesday, March 21, 2006
Saketh's first birthday invitation card
Friday, February 17, 2006
Another Interesting Quote
—Unknown
Tuesday, February 14, 2006
Monday, February 13, 2006
Funny but real
—Scott Adams
Monday, February 06, 2006
Quotes of the Day
Swami Vivekananda
There is no security on this earth; there is only opportunity
Science without religion is lame, religion without science is blind.
Albert Einstein
Sunday, February 05, 2006
Free & simple web picture software
Saturday, February 04, 2006
My Take on the movie "Heat"
Two great actors (Al Pacino & Robert De Niro) acted quite brilliantly.
It is a story about two people on opposite sides of the law.
Al Pacino as the cop and De Niro as the bad guy.
The movie starts with De Niro & his team stealing some bonds from a firm in broad daylight in LA. From the next scene onwards, Al Pacino investigates.
The director (Michael Mann) portrayed both characters quite deeply.
The best part is the confrontation between Al Pacino & Robert De Niro.
I expected an alternate ending. (Let De Niro go free or escape.)
A good & worth-watching thriller movie.
This movie goes into my list of all-time favorite thrillers.
Tuesday, January 17, 2006
My Take on the movie “The Insider” (1999)
Two great actors (Al Pacino & Russell Crowe) acted quite brilliantly.
It is based on a true story about a whistleblower who knows too much about the inside of the tobacco industry. Al Pacino plays the CBS 60 Minutes producer who tries to air the whistleblower's story on TV at any cost. The whole movie is about these two people's struggle to air the truth.
Again, it is not an action movie, but the drama is intense.
A good & worth-watching movie.
Wednesday, January 11, 2006
Quotes of the Day
Men marry because they are tired; women because they are curious. Finally both are disappointed.
If there is a wrong way to do something, then someone will do it.
Tuesday, January 10, 2006
Lots of free photo prints
Use code "30UP-ROMI" to get an additional 30 (4X6) prints.
I ordered nearly 100 photos for $10.
Monday, January 09, 2006
My take on “The Bourne Identity” movie.
This weekend both kids were sick, so I stayed at home, sometimes working.
I watched The Bourne Identity and a few kids' Pooh movies, just to keep Dhanvi company.
Coming to the movie: this is a very neat and intelligent movie about a CIA agent. Everything is perfect, no flaws really. The director showed actor Matt Damon in a different dimension.
The movie starts with a body floating in the sea, with two bullets and lost memory.
As the movie progresses, it is all about his past, his identity and his life. The action sequences are quite good and feel quite natural, and the script is fast paced. No dull moment at all.
I regret not watching this movie until now. I heard it was good but never thought it was this good.
100% awesome movie for Spy/Action/Thriller fans.
Sunday, January 08, 2006
My take on the Dell Inspiron 600m Laptop
A couple of weeks back, I received my new Dell Inspiron 600m laptop from Dell. The configuration is a bit powerful. (I took it because I have big plans for video editing. Also it has 1 GB RAM with a DVD R/W burner. The CPU is a 1.7 GHz mobile Pentium.)
I have used it & like it a lot. As usual I checked CNET editor/user reviews before buying the gadget. (I don't know any other big user community.)
As far as the laptop is concerned, it is good. I love it a lot. It is a lightweight, thin laptop.
Also it is inexpensive compared to other brands for the same configuration. (I paid $650 for this little machine.) Battery life is pretty good. (Nearly 3 hours.)
A lot of people complain about this, but I am very happy with the battery life. I don't know why anyone wants more than this. It came with an NEC DVD burner. It is damn good, working flawlessly. See my other post to see how I am burning DVDs.
The only negative point I see is that the bottom left part of the laptop is pretty warm after a couple of hours+ of usage. It seems the hard drive is getting warm. For me it is OK. It is very rare for me to have the laptop open for a couple of hours at home with kids playing around me.
Also I need to check the inbuilt wireless and its performance. Sometimes it disconnects without any error messages. I need to check with Dell forums/support on this.
Other than a couple of minor issues, if you want a nice, powerful laptop within budget, I will definitely recommend the Dell 600m.
Friday, January 06, 2006
Dhanvi likes number “3”
Last night when mama was giving some biscuits to Dhanvi, he was insisting, “I want 3.”
In fact I have observed this for quite some time. He always asks for quantity 3.
I got a little curious and asked why he likes number “3”.
Guess his response
.
.
.
.
.
.
.
.
.
.
because I am 3 years old.
Thursday, January 05, 2006
Free Excel viewer
Try the above link. It is free.
It is the best option if the MS Excel software is not installed on your machine.
However, it will not let you author Excel documents.
Crazy Barnes & noble
Guess what, yesterday I received 11 different boxes from bn.com via UPS. I was shocked to see so many boxes. I don't know why they did it like that. Sometimes, for multiple copies of the same item, they shipped the same item in different packages. It is truly stupid. If you go to the US post office to mail those packages, USPS will charge more than $5 per package. My order was worth $50+, with no shipping charge. Assume bn gets a deep discount (50%) from UPS; bn still pays more than $25 for shipping alone. Boxes, packing material, laser-printed invoices and return slips consume the rest of the $25, not counting labor costs etc. I was amazed at the way the bn online store business works. Instead of selling like this, they could donate to some schools or libraries for a tax write-off; that is much better & more profitable for bn's core business. I feel this is not the way to run a business. What do you think?
Again, Barnes & Noble is my favorite store. Every week I visit at least once.
My comments are about their online order process system only.
If you are a book lover, buy the top 100 bargains from the following link.
http://www.barnesandnoble.com/saleannex/top100_cds2.asp?PID=4030&start=1&userid=yS36ilqCNH&cds2Pid=4030
Tail piece: I opened all the boxes one after the other to check the books. Dhanvi looked through all the story & activity books once, put them to one side and played with the boxes for a whole two hours. (Imagining them as boats, bricks etc.) One of my friends used to joke that if you take a birthday gift to kids, the kids play with the gift wrap more than the original toy. Same case with my kid too. Hoping that he will enjoy his new books, activity stuff etc.
Wednesday, January 04, 2006
Quotes of the Day
-John Quincy Adams (unverified)
It is better 100 guilty persons should escape than one innocent person should suffer.
-Chanakya Indian politician, strategist and writer
I have been using this e-mail address since 1996.
I have received so many e-mails with lots of people's favorite quotes.
From here onwards, I will update my blog with my favorite quotes collection.
(It is a huge and good collection (?).)
Tuesday, January 03, 2006
My way of DVD burning
that I don't need
- Remove all the other crap I don't want with VobBlanker
- Fix the menus with MenuEdit if you want
- Burn with Nero
Most of the software is freeware & the Nero software is available from www.download.com
(Trial version.)
I will add more links later.
Perfect & innovative movie "Memento" (2000)
Just try to tell the story for 10 minutes. Then go back another 10 minutes to what happened before, and so on. This idea struck me after watching the movie Pulp Fiction. But somebody else had the same idea & made a good thriller movie, “Memento”. The plot is very simple: a revenge story. But as explained earlier, the movie shows the ending scene first and goes backwards. A perfect & innovative way of telling the story.