India is the land of arranged marriages, and the protagonist of this story had recently met five prospective brides who were equally eligible and evenly matched. Each girl had at least one unique attribute that he wanted his spouse to possess, and each of these attributes was equally important to him: one of them was very beautiful, another was highly educated, another had a good sense of humour, and so on. It was difficult for him to choose one girl over another. The protagonist met a data scientist to discuss his problem of choices.

Data Scientist: This is similar to social choice theory, a framework for weighing individual interests, values, or welfares in aggregate to reach a collective decision using symbolic logic. Let’s make an algorithm to evaluate the social compatibility between people. Then we will use it to find your best match from your prospective brides.

Protagonist: Why bother about evaluating social compatibility?

DS: Do you realize its immense business potential? The top two Indian matrimonial sites draw 2 million visitors accessing over 15 million pages daily. If these two websites implement the algorithm, then even if only 5% of the visitors actually use it, you have 100,000 daily users in India alone. In the future, if the top matrimonial and dating websites across the world implement the algorithm, you could hit half a million daily users. On a pay-per-use or fixed monthly rate revenue model, look at the expected income.

P: So how can we quantify compatibility?

DS: A simple approach is to rank the girls in order of each attribute and then combine the individual ranks into a composite rank using the known methods of combining ranks. The top composite ranked girl is your first choice.

DS: Unfortunately, a ranking-based approach is conceptually flawed. Economics Nobel Laureate Dr. Kenneth Arrow proved Arrow’s Impossibility Theorem, a pioneering theorem of social choice theory, which states that when there are three or more options, no rank-order voting system can satisfy all of a set of reasonable fairness criteria at once. Moreover, for critical social decisions psychology could prevail over statistics. The ideal methodology should be able to quantify the psychological aspect of human behaviour.

DS: Ideally you would want all the desired attributes of a dream spouse in one person. But in reality, the desired attributes will be distributed across different girls. So you have to give up on one attribute to gain on another. Thus the attributes are competing against each other so you have to make competitive choices.

DS: Assume that you have a total of twenty points to allocate across the attributes. How much are you willing to give up on the beauty to gain on the educational qualification of your spouse? If you allocate 15 points to beauty, you have only 5 points to allocate to education. When faced with scarce resources (points) you will be much more judicious in spending. Hence competitive choices are a better quantification of your actual psychological preference.
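(An aside for the curious: the point-allocation idea can be sketched in a few lines of Python. This is only a minimal constant-sum illustration, not the data scientist's actual algorithm and not a full conjoint analysis. The attribute names, the 0 to 10 attribute scores, and the candidates A, B, C are all made up for the example.)

```python
# Hypothetical 20-point budget split across three attributes.
BUDGET = 20
weights = {"beauty": 8, "education": 7, "humour": 5}
# Every point given to one attribute is a point taken from another.
assert sum(weights.values()) == BUDGET

# Hypothetical attribute scores (0-10 scale) for three made-up candidates.
candidates = {
    "A": {"beauty": 9, "education": 5, "humour": 4},
    "B": {"beauty": 6, "education": 9, "humour": 7},
    "C": {"beauty": 5, "education": 6, "humour": 8},
}


def weighted_score(attrs, weights):
    """Average of the attribute scores, weighted by the allocated points."""
    total = sum(weights.values())
    return sum(weights[a] * attrs[a] for a in weights) / total


scores = {name: weighted_score(attrs, weights) for name, attrs in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Shifting points from one attribute to another changes the ranking directly, which is the trade-off the fixed budget is meant to expose.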

P: And how do we quantify competitive choices?

DS: By using conjoint analysis. It is based on mathematical psychology and is widely used in psychophysics, perception, decision-making and the quantitative analysis of behaviour. I will create a social compatibility algorithm and use your data to see what your actual psychological preferences are; then we will find your most suitable match.

P: Really? You can build such an algorithm?

DS: Rest assured; I have learned conjoint analysis from one of the pioneers of the subject, Dr. V. Srinivasan. Give me two days.

(Two days later)

DS: The social compatibility algorithm is ready; and based on your competitive choices, it suggests that your most suitable match is the second girl. Hmmm … she is a teacher but you didn’t tell me what she teaches?

P: Well, she teaches statistics in a college.

DS: Statistics! I knew the algorithm was right.

Claimer: Based on a true incident. Both the protagonist and the data scientist work in the analytics industry. The protagonist and the lady statistician are now seeing each other.

Entry 1. Let f(r) be any divergent series of positive terms, q_{r,k} be the r^{th} k-power free number, \zeta(k) be the Riemann Zeta function. Let S_f(r) = \sum_{i=1}^{r} f(i). If  g(x) is Riemann integrable in (0, \infty) then,

\displaystyle{ \sum_{r=1}^{n} f(q_{r,k}) g \Big(\frac{S_f(r)}{\zeta(k)}\Big) \sim \int_{f(1)}^{\frac{S_f(n)}{\zeta(k)}} g(x)dx.}

Corollary 1. As k \rightarrow \infty, \zeta(k) \rightarrow 1. Also every natural number is k-power free when k \rightarrow \infty. Hence the above result reduces to

 \displaystyle{ \sum_{r=1}^{n} f(r) g(S_f(r))\sim \int_{f(1)}^{S_f(n)} g(x)dx}.
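As a numerical sanity check of Corollary 1 (not a proof), the sketch below compares both sides for one assumed choice of functions: f(r) = 1/r, whose series diverges, and g(x) = x, whose antiderivative x^2/2 is used for the integral. The helper name `corollary_ratio` and the choice of f and g are illustrative assumptions, not part of the original entry.

```python
def corollary_ratio(f, g, g_antiderivative, n):
    """Return (sum_{r<=n} f(r) g(S_f(r))) / (integral of g from f(1) to S_f(n))."""
    partial_sum = 0.0  # running value of S_f(r)
    lhs = 0.0
    for r in range(1, n + 1):
        partial_sum += f(r)
        lhs += f(r) * g(partial_sum)
    rhs = g_antiderivative(partial_sum) - g_antiderivative(f(1))
    return lhs / rhs


def f(r):
    return 1.0 / r  # harmonic series: divergent, positive terms


def g(x):
    return x


def G(x):
    return x * x / 2.0  # antiderivative of g


ratio_small = corollary_ratio(f, g, G, 1_000)
ratio_large = corollary_ratio(f, g, G, 100_000)
print(ratio_small, ratio_large)
```

For this choice the ratio stays above 1 and moves towards 1 as n grows, which is consistent with the asymptotic equivalence; the convergence is slow because S_f(n) grows only logarithmically.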

Entry 2. Further let q'_{k,n} be the n^{th} k-power containing number and f be any function Riemann integrable in (1,\infty); then,

\displaystyle{ \sum_{k=2}^{\infty}\frac{1}{k} \Big\{\frac{f(q'_{k,1}) + f(q'_{k,2}) + f(q'_{k,3}) +\ldots}{f(q_{k,1}) + f(q_{k,2}) + f(q_{k,3}) +\ldots}\Big\}= 1 - \gamma}.

The first documented work on regression was published in the year 1898 by Schuster, and since then several regression models have been proposed. Regression is an active area of research because of the widespread use of regression analysis in scientific, statistical, industrial and commercial applications. Ideally we would want a regression method which gives a perfect fit between the regression curve and the actual values of the data points. However, the existing regression methods suffer from two major drawbacks:

(i) A regression method may be suitable for one type of data and unsuitable for another. For example, ordinary least squares is suitable for linear data, but real-world data are not necessarily linear, so linear regression would not be a good choice if the data are not roughly linear.

(ii) Unless there is a perfect match between the actual values of the data and the values given by the regression model, there will always be an error. 

We present a new method of regression which works for all types of data: linear, polynomial, logarithmic, or erratic random data that shows no particular trend. Our method is an iterative process based on the application of sinusoidal series analysis in nonlinear least squares. Each iteration of this method reduces the sum of the squares of the residuals, and therefore, by successive iteration, we show that every finite set of co-planar points can be expanded as a sinusoidal series in infinitely many ways. In other words, given a set of co-planar points, we can fit infinitely many curves that pass through all these points. By setting a convergence criterion in terms of an acceptable error, we can stop the iteration after a finite number of steps. Thus in the limiting case, we obtain a function that gives a perfect fit for the data points.
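To give a feel for this kind of iteration, here is a minimal Python sketch. It is not the procedure from the paper: it is a simple greedy stand-in that, at each step, scans an assumed grid of frequencies, fits one term a sin(wx) + b cos(wx) to the current residuals by linear least squares, and keeps the term only if it lowers the sum of squared residuals. The data points, the frequency grid, and the function names are hypothetical.

```python
import math


def fit_one_sinusoid(xs, residuals, freqs):
    """Best single term a*sin(w*x) + b*cos(w*x) for the residuals over a frequency grid."""
    # Start from "no term": w = a = b = 0 leaves the residual sum of squares unchanged.
    best = (0.0, 0.0, 0.0, sum(r * r for r in residuals))
    for w in freqs:
        s = [math.sin(w * x) for x in xs]
        c = [math.cos(w * x) for x in xs]
        ss = sum(u * u for u in s)
        cc = sum(v * v for v in c)
        sc = sum(u * v for u, v in zip(s, c))
        sr = sum(u * r for u, r in zip(s, residuals))
        cr = sum(v * r for v, r in zip(c, residuals))
        det = ss * cc - sc * sc
        if abs(det) < 1e-12:
            continue  # sin and cos columns are (nearly) dependent on these x values
        # Solve the 2x2 normal equations for a and b.
        a = (sr * cc - cr * sc) / det
        b = (cr * ss - sr * sc) / det
        ssr = sum((r - a * u - b * v) ** 2 for r, u, v in zip(residuals, s, c))
        if ssr < best[3]:
            best = (w, a, b, ssr)
    return best


def sinusoidal_fit(xs, ys, freqs, max_terms=8, tol=1e-6):
    """Greedily add sinusoidal terms; returns the terms and the SSR after each step."""
    residuals = list(ys)
    history = [sum(r * r for r in residuals)]
    terms = []
    for _ in range(max_terms):
        w, a, b, ssr = fit_one_sinusoid(xs, residuals, freqs)
        if ssr >= history[-1]:
            break  # no frequency on the grid improves the fit
        terms.append((w, a, b))
        residuals = [r - a * math.sin(w * x) - b * math.cos(w * x)
                     for r, x in zip(residuals, xs)]
        history.append(ssr)
        if ssr < tol:
            break  # convergence criterion: acceptable error reached
    return terms, history


# Hypothetical erratic data with no particular trend.
xs = [float(i) for i in range(12)]
ys = [3.1, 0.4, 5.9, 2.2, 7.5, 1.3, 4.8, 6.6, 0.9, 3.7, 8.2, 2.5]
freqs = [0.05 * k for k in range(1, 201)]
terms, history = sinusoidal_fit(xs, ys, freqs)
print(history)
```

The list `history` is strictly decreasing by construction, which mirrors the claim that each iteration reduces the sum of squared residuals; how fast it falls depends on the data and on the frequency grid.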

The regression method is published on arXiv.

Kanakanahalli Ramachandra (1933-2011) was perhaps the real successor of Srinivasa Ramanujan in contemporary Indian mathematics. It would be no exaggeration to say that without the efforts of Ramachandra, analytic number theory could have become extinct in India back in the mid 1970s. I was fortunate to get the opportunity to learn mathematics from the master himself. ‘On the half line: K. Ramachandra’ is a short biography on the life and works of K. Ramachandra.

The complete article is published in the Vol 21, September 2011 issue of the Mathematics Newsletter, Ramanujan Mathematical Society.

Some photographs of K. Ramachandra

Continuing with the previous post on this topic, a stronger form of the conjecture on the lower bound in Corollary 1 is as follows.

Conjecture. Let p_n be the n-th prime. Then for n \ge 32,

\displaystyle{p_n^{\frac{1}{n}} > \Big(1+\frac{1}{n^2}\Big) p_{n+1}^{\frac{1}{n+1}} }.

The above conjecture implies that for all sufficiently large n ,

p_{n+1} - p_n < (\ln p_n - 1)(\ln p_n - \ln\ln n).

Prof. Marek Wolf, Institute of Theoretical Physics, Wroclaw, Poland has verified the above conjecture for primes up to 2^{44}.
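For readers who want to try the inequality themselves, the following Python sketch checks it for 32 \le n \le 5000 using a simple sieve. It compares logarithms of both sides to avoid taking n-th roots directly. The range and the function names are arbitrary choices for illustration, and this small check is in no way comparable to the verification mentioned above.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def conjecture_holds(primes, n):
    """Check p_n^(1/n) > (1 + 1/n^2) * p_{n+1}^(1/(n+1)) by comparing logarithms."""
    p_n, p_next = primes[n - 1], primes[n]  # primes[0] is p_1 = 2
    lhs = math.log(p_n) / n
    rhs = math.log1p(1.0 / n ** 2) + math.log(p_next) / (n + 1)
    return lhs > rhs


primes = primes_up_to(60_000)  # contains more than 5001 primes
failures = [n for n in range(32, 5001) if not conjecture_holds(primes, n)]
print(failures)
```

An empty `failures` list means the inequality held for every n in the tested range.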
