Tag Archives: Idea

Weight your toys

I decided to write this post because the problem reminded me of my weakness when dealing with algorithm problems.

Here’s the problem; I put it in a gist.


How do we abstract real-world problems into sorting problems, and how do we take advantage of sorting to find the things that interest us?

In the code above, I wanted to use the observer design pattern ( compare_event ). That is the part to emphasise: we always search for the things that interest us; if not, why would we search at all? That is the pattern of human beings, and I always decouple things when I find a point that fits how normal brains work.

Here are the steps that I always follow to create a solution:

  1. Understand the things, feel them, and try to measure them. There is always a way to quantify things, though most people get stuck here or find a really bad way to measure.
  2. The first step has converted states to numbers. We can use those numbers to compare two things.
  3. If I want to search for something, I always search while sorting. You always do the two together; think about pairing the socks from your washing machine. Even a school child does it without thinking.
  4. I interleave tasks with each comparison. Searching itself is also a task. Each task can access the current comparison result and the process state.
  5. Choose the sort algorithm that fits the situation best.
  6. Launch the whole process.
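
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the code from the gist: the names sort_with_observers, weigh, collect_equal and the toy data are all hypothetical, and Python’s built-in sort stands in for whichever algorithm fits the situation.

```python
from functools import cmp_to_key


def sort_with_observers(items, weigh, observers):
    """Sort items by weight and notify every observer on each comparison."""
    state = {"comparisons": 0}  # process state shared with all tasks

    def compare(a, b):
        wa, wb = weigh(a), weigh(b)       # steps 1-2: turn states into numbers
        result = (wa > wb) - (wa < wb)    # -1, 0 or 1
        state["comparisons"] += 1
        for on_compare in observers:      # step 4: run tasks during each comparison
            on_compare(a, b, result, state)
        return result

    # Step 5: here the sort algorithm is simply Python's built-in sort.
    return sorted(items, key=cmp_to_key(compare)), state


# Hypothetical toys as (name, weight) pairs.
toys = [("car", 3), ("bear", 5), ("ball", 3), ("duck", 1)]
same_weight = set()


def collect_equal(a, b, result, state):
    """Search task: remember every pair of toys that weigh the same."""
    if result == 0:
        same_weight.add(frozenset((a[0], b[0])))


# Step 6: launch the whole process.
ordered, state = sort_with_observers(toys, lambda toy: toy[1], [collect_equal])
```

The search task never walks the list itself; it only listens to comparisons that the sort performs anyway, which is the decoupling the observer pattern is meant to give.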


Git Daemon: Two Common Mistakes

This is a powerful feature, available in git 1.7 and later. Git listens on a port and receives incoming git protocol requests. When we need to auto-deploy a project inside a private network, it is one of the easiest ways to work. It doesn’t depend on HTTP or SSH, and is much faster. Though it doesn’t support user authentication, it is still a good solution for sharing code.

Two common mistakes that the git documentation doesn’t mention:

  1. A whitelist path must be set.
    Something like this won’t work: 
  2. The whitelist path should be an absolute path or prefixed with ‘./’.
    Something like below won’t work; test_repo should be prefixed as ./test_repo, or an absolute path should be used: 

A simple correct example looks like this:

Then access your server with git push git://server_addr:8123/test_repo .

I wrote a simple shell function to help with setup:


Thoughts on naming convention

I have heard a lot of people say that they want to make their coding style the same as that of the frameworks they use, so they study their predecessors’ code and mimic the so-called golden standards.

Keeping your coding style the same as the framework’s is not wrong. But don’t lose your own voice when you are trying to follow others’.

I prefer my coding style to be different from that of the framework I’m using. One big advantage is that it becomes much easier to tell which parts of the code are mine and which belong to the API. For example, a mix of camelCase and underscore conventions:

If a new member comes to read this Java snippet, they will find the mix helps them see the road map of the project. Sometimes I think a proper mix makes life easier, doesn’t it?

Collective Phenomena

Inspired by this post, I wrote a CoffeeScript implementation. This is a very interesting phenomenon that reveals the beautiful philosophy behind nature: a slight push to a collection of simple things can make miracles : D

As you can see, each particle just moves around a circular orbit with the same velocity; only the initial position is different. However, when they are put together as a whole, things begin to get interesting.


Some thoughts on the software’s auto-update design pattern

There are many ways to give a program the ability to auto-update. But as far as I know, there is no way to make absolutely sure the program will update itself as expected. For example, say you have defined a data type in your program and store the data locally, and you have happily distributed your product to thousands of users. Months pass, and you add some details to the basic data type. You migrate all the user data properly, and everything goes fine. Not a single complaint is heard from your users. Things turn complicated when your data type reaches its third version. Now you have 3 versions of the data, and some of your users may still be on the oldest one. Fortunately, you left a tag in the data type to mark its version. So you take your time to check the version, and implement 2 methods to migrate the data.

What if this pattern continues for around 10 versions? How many migration methods do you need to implement? Nine, of course, one for each step from a version to the next. But what if, for some reason, some of your users have reverted to an older version? By basic combinatorics, migrating directly from any version to any other takes 10×9, which means 90 methods. That sounds crazy.
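
To make the counting concrete, here is a minimal sketch of the version-tag approach with chained, one-step migrations. The record fields (name, email, tags) and all function names are hypothetical, and only upgrades are handled; the reverted-version case is exactly what this simple chain does not cover.

```python
LATEST_VERSION = 3


def migrate_1_to_2(record):
    """Hypothetical change: version 2 adds an email field."""
    upgraded = dict(record)
    upgraded["email"] = ""
    upgraded["version"] = 2
    return upgraded


def migrate_2_to_3(record):
    """Hypothetical change: version 3 adds a list of tags."""
    upgraded = dict(record)
    upgraded["tags"] = []
    upgraded["version"] = 3
    return upgraded


# One method per step between neighbouring versions.
MIGRATIONS = {1: migrate_1_to_2, 2: migrate_2_to_3}


def upgrade(record):
    """Read the version tag and apply each step until the record is current."""
    while record["version"] < LATEST_VERSION:
        record = MIGRATIONS[record["version"]](record)
    return record


def chained_method_count(versions):
    """Methods needed when every upgrade goes through neighbouring versions."""
    return versions - 1


def pairwise_method_count(versions):
    """Methods needed to go directly from any version to any other one."""
    return versions * (versions - 1)
```

With 10 versions, chained_method_count(10) gives the nine methods and pairwise_method_count(10) gives the 90 mentioned above.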

The main problem is that you can’t be sure all users have always updated to the newest version. In other words, you need a way to sync every user’s program version without hurting their data. No one can predict the future; we never know what the data will finally look like.

To be continued…

MSV Application Architecture

I think the general MVC architecture is not good enough for a complex dynamic application. The most confusing part of MVC is its Controller. It receives the data the user sends, sends commands to change the state of the Model, and also takes responsibility for changing the presentation of the View.

I think it could be simpler:

  • Model
  • Service
  • View

To separate each component’s responsibilities and make the architecture more open, I replaced the Controller with the Service. The Service only serves the data that the View requests, for example over REST, or sends commands to change the Model’s state. All UI logic moves to the View, and all View requests are asynchronous, so that we take full control of the presentation flow of the application. Front-end developers then only need client-side knowledge and skills to construct the dynamic content of the application.

The disadvantages of this architecture are obvious. The client must at least provide a programmable interface. For an HTML-based web application, for example, the browser must support JavaScript and have it enabled. And looser coupling means lower performance.
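
The split described above might look something like this in miniature. It is only a sketch under assumptions: the class names mirror the three components, but the request format, the method names and the in-memory list are all hypothetical, and a real Service would sit behind something like a REST endpoint.

```python
import asyncio


class Model:
    """Holds the application state and nothing else."""

    def __init__(self):
        self._items = []

    def add(self, text):
        self._items.append(text)

    def all(self):
        return list(self._items)


class Service:
    """Answers data requests and forwards commands; contains no UI logic."""

    def __init__(self, model):
        self._model = model

    def handle(self, request):
        if request["action"] == "list":
            return {"items": self._model.all()}
        if request["action"] == "add":
            self._model.add(request["text"])
            return {"ok": True}
        return {"error": "unknown action"}


class View:
    """Owns all UI logic and talks to the Service asynchronously."""

    def __init__(self, service):
        self._service = service

    async def request(self, payload):
        await asyncio.sleep(0)  # stands in for a network round trip
        return self._service.handle(payload)

    async def add_and_render(self, text):
        await self.request({"action": "add", "text": text})
        reply = await self.request({"action": "list"})
        return "\n".join(f"- {item}" for item in reply["items"])
```

For example, asyncio.run(View(Service(Model())).add_and_render("hello")) returns the rendered list; how the list is presented is decided entirely in the View.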

A better way to combine hash codes

Take, for example, two int numbers x and y. A simple way is like below:

But it will easily result in collisions. To avoid this, Joshua Bloch presented an elegant way to combine two hash codes:

The use of prime numbers is the most brilliant part of the algorithm.
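
As a rough illustration of the difference, here is a sketch in Python that emulates Java’s 32-bit int overflow. Two things are assumptions: the “simple way” is taken to be XOR, since the original snippet is not shown, and the second function follows the recipe from Bloch’s Effective Java (start at 17, multiply by 31, add each field). The function names are hypothetical.

```python
def to_int32(n):
    """Wrap an integer to a signed 32-bit value, like a Java int."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def naive_combine(x, y):
    """Assumed 'simple way': XOR is symmetric, so (x, y) and (y, x) collide."""
    return to_int32(x ^ y)


def bloch_combine(x, y):
    """Effective Java style: prime seed 17, prime multiplier 31 for each field."""
    result = 17
    result = to_int32(31 * result + x)
    result = to_int32(31 * result + y)
    return result
```

With this sketch, naive_combine(1, 2) and naive_combine(2, 1) are both 3, while bloch_combine(1, 2) is 16370 and bloch_combine(2, 1) is 16400, so the order of the fields now matters.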

A smart way to collect similar names

Lots of programs need a function like this: get all the similar file names in a folder, or get all the files that contain similar content. Media players and image browsers are examples. It even comes up when arranging files on a server.

Here’s what comes to mind.

  1. Levenshtein distance
  2. Eigenvalue of a string
  3. Vectorize a string

Levenshtein distance is easy to use and really fast at runtime, but it’s a bad idea to cache the results directly. In the end we have to compare all the strings every time, which just takes more CPU time. Vectorizing a string may be more difficult, but we can cache the eigenvalues, which saves time without taking up too much storage.
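
The two approaches can be compared with a small sketch. The Levenshtein part is the standard dynamic-programming algorithm. The vector part is only one possible reading of “vectorize a string”: it assumes character-bigram counts compared by cosine similarity, with the per-name vector cached, and its function names are hypothetical.

```python
import math
from collections import Counter
from functools import lru_cache


def levenshtein(a, b):
    """Edit distance between two strings, keeping only two rows of the table."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


@lru_cache(maxsize=None)
def bigram_vector(name):
    """Cached per-name vector: counts of adjacent character pairs (assumption)."""
    padded = f" {name.lower()} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine_similarity(u, v):
    """Similarity of two count vectors, from 0.0 (unrelated) to 1.0 (same)."""
    dot = sum(count * v[gram] for gram, count in u.items())
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def similarity(a, b):
    """Compare two names using only their cached vectors."""
    return cosine_similarity(bigram_vector(a), bigram_vector(b))
```

The distance has to be recomputed for every pair of names, while each vector is computed once per name and reused, which is the caching advantage described above.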

Which one to choose depends on the project.