2006 Extravaganza of Predictions - I think you'll find I'm really going out on a limb here!
Predictions for 2006 - Part 1, Predictive Modeling
The 52-week model for the S&P 500 is currently reading in the 6th decile. The 6th decile is neutral in the sense that it suggests an average 52-week return. Median and average returns for this reading are 13.4% and 11.7% vs. standard median and average returns of 12.4% and 11.8%. The 6th decile implies a 79% chance of gains, with about a 52% chance of above-average gains, and a still-hefty 13% chance of ending the year down more than 10% from where we are today. Overall, the model output suggests a U.S. stock market year that is somewhat typical and not anything to be especially excited about or afraid of.
The model was built with a weekly data set going all the way back to November of 1981, and over that time period the S&P 500 had a median and average gain of 12.4% and 11.8%, with 79.4% of years ending in a gain. Generally, I run the 13-week and 52-week models every Friday, compare them to the 21-day output from the EPCR model, and then also look at some other technical indicators to get a feel for what the market is going to do. Based on this combination of indicators, I would think we have a slow start to the year, a good entry point coming up soon, and overall good performance compared to 2005 by the time we look back at 2006. However, this update is all about looking a year in advance.
I revamped the models in early November in order to account for the discontinuation of M3 from the Federal Reserve data. The new model includes different technicals from the S&P 500 itself as well as a larger array of fundamental data. If your eyes glazed over during college mathematics classes, you might as well skip down to the parts where I talk about specific stocks.
My intent isn't to explicate the exact nature of the model I use. I want to explain how it was developed and what inputs it uses, and to provide enough information that any reasonably savvy math nerd could devise their own model that is just as good or maybe better. Another important point is that selection of independent variables plays a major role in any data mining project. As one of my former bosses was fond of saying, "there is no such thing as a truly independent variable" (hello Thom). During the course of the regressions, I sometimes found that adding an additional variable would push the P-value of a previously retained variable above the significance threshold! Also, from previous modeling I had found that M3 carried a high correlation to several of the variables that I would now be using as independents. It's entirely likely that two different mathematicians could approach this problem and, through different variable selections and procedures, wind up with similarly predictive models that used slightly different inputs.
This model, along with the others, was developed using multivariate regression in Microsoft Excel. While it's possible to do single-variable nonlinear regression directly in Excel through the charting function, you can't do multivariate nonlinear regression directly. However, you can fool the system by transforming the variables nonlinearly and then running the multivariate regression available through Tools>Data Analysis>Regression. This is what I did, and I threw the kitchen sink at it in an effort to find a more predictive model that didn't use M3. Note: if you don't have a Regression option on your menu, you may need to use the Tools>Add Ins menu and check the Analysis ToolPak box, and your Excel CD should be handy, just in case.
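As a rough illustration of that trick outside Excel, here is a minimal Python sketch (NumPy assumed). The function name and the synthetic numbers are made up for the example, and this is not the actual model: it only shows that a plain linear solver will happily fit a parabola once the squared term is handed to it as just another column.

```python
import numpy as np

def fit_with_squares(x, y):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 using a purely linear solver."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # To the solver, the squared column is simply one more independent variable.
    design = np.column_stack([np.ones_like(x), x, x ** 2])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Synthetic, noise-free parabola (made-up numbers, not market data).
x = np.linspace(-1.0, 1.0, 21)
y = 0.10 + 0.50 * x - 0.80 * x ** 2
b0, b1, b2 = fit_with_squares(x, y)
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

The same idea carries over to any transform (logs, exponentials, and so on): transform first, then regress linearly.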
Since Excel provides for only 16 independent variables and 1 dependent variable, I proceeded in a stepwise fashion in order to data mine the model. Part of the standard regression output is Student's t-statistic, and the probability or P-value associated with it, for each independent variable. As far as P-values are concerned, the smaller the better. Loosely speaking, the P-value is the chance that a relationship this strong would show up at random if the variable actually had no effect, and most statisticians look at P-values below 5% as being significant. So I started at one end of the database, performed the regression, eliminated the non-significant variables, added in more variables to make 16 independents, performed the regression, etc. When I had eliminated all but 16 of the original independent variables I had selected to data mine, I was ready to backtest the model for effectiveness.
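The batch-by-batch procedure above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the spreadsheet workflow itself: NumPy stands in for Excel's Regression tool, the P-values use a normal approximation to Student's t (reasonable with a thousand or so rows), and the function names, the demo data, and the batch size of 4 are all hypothetical.

```python
import math
import numpy as np

def ols_p_values(X, y):
    """Two-sided p-values for each column of X; an intercept is added internally."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = float(resid @ resid) / (n - k - 1)
    std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(design.T @ design)))
    t_stats = beta / std_err
    # Assumption: normal approximation to Student's t, fine for large samples.
    p = [math.erfc(abs(t) / math.sqrt(2.0)) for t in t_stats]
    return p[1:]  # skip the intercept

def stepwise_in_batches(X, y, names, batch=16, alpha=0.05):
    """Regress, drop non-significant columns, top up with untested ones, repeat."""
    kept, pending = [], list(range(X.shape[1]))
    # Stops when every candidate has been tested or no slots are free.
    while pending and len(kept) < batch:
        room = batch - len(kept)
        current = kept + pending[:room]
        pending = pending[room:]
        p_values = ols_p_values(X[:, current], y)
        kept = [col for col, p in zip(current, p_values) if p < alpha]
    return [names[col] for col in kept]

# Synthetic demo: only x0 and x3 actually drive y (made-up data).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = 2.0 * X[:, 0] + 3.0 * X[:, 3] + rng.normal(scale=0.5, size=500)
names = [f"x{i}" for i in range(6)]
kept = stepwise_in_batches(X, y, names, batch=4)
print(kept)
```

Note that a pure-noise column still has roughly a 5% chance of slipping through at this threshold, which is one reason the result needs backtesting.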
I selected several technicals from the prior S&P 500 performance to examine. They are: the return over the last 26 and 52 weeks; the current price's relationship to the highs of the last 26 and 52 weeks; and the current price's relationship to the lows of the last 26 and 52 weeks. All of the data for the S&P 500 previous prices came free from Yahoo!Finance.
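For anyone building a similar data set, here is a minimal Python sketch of those three kinds of technicals for one look-back window. The exact definitions are an assumption on my part for illustration (the "relationship" to a high or low is taken as the latest close divided by the trailing extreme, minus one), and the function name and sample prices are made up.

```python
def trailing_features(closes, window):
    """Return, and distance from trailing high/low, for the latest weekly close.

    closes: weekly closing prices, oldest first.
    window: look-back in weeks (e.g. 26 or 52).
    """
    if len(closes) < window + 1:
        raise ValueError("need at least window + 1 closes")
    latest = closes[-1]
    past = closes[-(window + 1)]   # the close `window` weeks ago
    recent = closes[-window:]      # trailing window, including the latest close
    return {
        "return": latest / past - 1.0,
        "vs_high": latest / max(recent) - 1.0,  # 0 at a new high, negative below it
        "vs_low": latest / min(recent) - 1.0,   # 0 at a new low, positive above it
    }

# Tiny made-up price series with a 4-week window.
features = trailing_features([100, 110, 120, 90, 99], window=4)
print(features)
```

Calling it with window=26 and window=52 on the same series would give the six S&P 500 candidates; the same function works for yields or the dollar index.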
To represent the "meddling" of the Fed in the U.S. stock markets, I examined two variables, the Effective Federal Funds rate and the aggregate value of M2 (weekly, non-seasonally adjusted), which I obtained from the St. Louis Fed database. For each of these I calculated the 26-week change and the 52-week change. For M2, I also examined the change in the 26-week change from the reading one quarter ago (this is similar to a calculation I found useful for M3).
In keeping with the relationship between Treasury yields and stock market performance, I included for examination the yield change on the 10YT over the last 26 and 52 weeks; the current yield's relationship to the highs of the last 26 and 52 weeks; and the current yield's relationship to the lows of the last 26 and 52 weeks.
Since I found serious correlations between M3 and the S&P 500, as well as similar correlations between M3 and the U.S. Dollar Index, it made sense that they may be cross-correlated. So I examined the USDX change over the last 26 and 52 weeks; the current USDX's relationship to the highs of the last 26 and 52 weeks; and the current USDX's relationship to the lows of the last 26 and 52 weeks. All of the data for the USDX was downloaded from the St. Louis Fed database with the USDX values calculated in Excel to match the current calculation.
Since I have found many parabolic relationships in my stock market modeling, I included the squares of each of the above. I did not check for logarithmic or exponential relationships, mostly because I hadn't found any in previous models, although I suppose I could have looked at them, too. All told, that made 46 candidate independent variables (23 base variables plus their squares) to data mine for the (at most) 16 most significant.
By the sixth regression, I had arrived at my goal. The 16 retained variables include, in order of importance to the model:

7 of 12 possible USDX variables, with the general tendency of a strengthening dollar being good, unless it's overextended;

2 of 12 possible S&P 500 technicals, with the contrary tendencies of momentum on one hand and overextension/mean reversion on the other;

4 of 10 possible Fed-Meddler indicators, with recent strict monetary policy being bullish in the long run, and rapid changes in monetary policy being destabilizing; and

3 of 12 possible Treasury yield indicators, with recent strong up moves being bearish, and larger long-term moves being mildly stabilizing.
Here is the regression output from Excel (sans exact variable definitions):
Regression Statistics
Multiple R           0.471455
R Square             0.222270
Adjusted R Square    0.211760
Standard Error       0.143527
Observations         1201
ANOVA
              df         SS         MS          F   Significance F
Regression    16   6.970584   0.435662  21.148675   3.47853E-54
Residual    1184  24.390332   0.020600
Total       1200  31.360916
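For readers who want to see how the summary statistics hang together, here is a short Python sketch that rebuilds them from the ANOVA sums of squares and degrees of freedom quoted above. The variable names are mine; the formulas are the standard ones for ordinary least squares.

```python
import math

# Sums of squares and degrees of freedom as quoted in the ANOVA output.
ss_reg, ss_res = 6.970584, 24.390332
df_reg, df_res = 16, 1184

ss_total = ss_reg + ss_res
n_obs = df_reg + df_res + 1                  # 1201 weekly observations

r_squared = ss_reg / ss_total                # share of variance explained
multiple_r = math.sqrt(r_squared)
adj_r_squared = 1.0 - (1.0 - r_squared) * (n_obs - 1) / df_res
ms_reg = ss_reg / df_reg
ms_res = ss_res / df_res
f_stat = ms_reg / ms_res
std_error = math.sqrt(ms_res)                # standard error of the regression

print(round(multiple_r, 4), round(r_squared, 4), round(adj_r_squared, 4))
print(round(f_stat, 2), round(std_error, 4))
```

In plain terms: the model explains about 22% of the variation in subsequent 52-week returns, and the typical miss (standard error) is around 14 percentage points.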
Backtesting has validated the model to a large extent. Approximately 24 years of data on a weekly basis yields 1,201 data points.
By separating the historic model output into deciles, it's plain to see that actual outcomes fall into place very well with output. The three numbers floating above the table are the gain, beat, and bust percentages over the entire data set. For example, there were 121 observations where the output was in the 10th decile. For those 121 observations, the worst performance was a subsequent return on the S&P 500 of -5.9%, the median was 26.4%, etc. For those observations, 92.6% of them returned a positive gain on the S&P 500, 88.4% of them beat the average annual return, and there was never a bust (a return of -10% or worse). Unfortunately, middle-range output is not as predictive as when the model has an extremely high or low reading.
                                                         79.4%   50.0%   11.7%
decile  count     min  median  average     max   stdev    gain    beat    bust
     1    120  -29.3%    0.9%    -1.3%   33.8%   14.8%   51.7%   16.7%   33.3%
     2    120  -30.0%    7.3%     5.5%   39.5%   13.9%   72.5%   29.2%   15.0%
     3    120  -28.5%    6.4%     5.5%   47.6%   16.0%   72.5%   24.2%   19.2%
     4    120  -25.4%    6.4%     5.2%   43.0%   15.0%   68.3%   31.7%   20.0%
     5    120  -25.6%    8.7%     8.9%   51.8%   15.8%   76.7%   44.2%   11.7%
     6    120  -34.1%   13.4%    11.7%   56.0%   17.6%   79.2%   52.5%   13.3%
     7    120  -16.7%   19.3%    17.5%   56.7%   13.1%   90.0%   64.2%    4.2%
     8    120  -12.8%   20.4%    20.5%   57.7%   12.7%   94.2%   70.8%    0.8%
     9    120   -4.9%   21.0%    20.4%   56.1%   10.2%   96.7%   77.5%    0.0%
    10    121   -5.9%   26.4%    24.1%   45.1%   10.2%   92.6%   88.4%    0.0%
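A table like this can be produced from any list of (model output, subsequent return) pairs. The Python sketch below is a minimal version under a few assumptions of mine: equal-sized buckets with the leftover observations going into the top decile (which reproduces the 120/121 counts for 1,201 rows), "beat" meaning a return above the full-sample average, and "bust" meaning a return of -10% or worse. The function name and dictionary keys are hypothetical.

```python
import statistics

def decile_table(outputs, returns, buckets=10, bust_level=-0.10):
    """Sort observations by model output, split into buckets, summarize returns."""
    pairs = sorted(zip(outputs, returns))
    n = len(pairs)
    size = n // buckets
    overall_avg = statistics.mean(returns)
    rows = []
    for d in range(buckets):
        start = d * size
        end = (d + 1) * size if d < buckets - 1 else n  # top bucket takes the remainder
        rets = [r for _, r in pairs[start:end]]
        rows.append({
            "decile": d + 1,
            "count": len(rets),
            "min": min(rets),
            "median": statistics.median(rets),
            "average": statistics.mean(rets),
            "max": max(rets),
            "gain": sum(r > 0 for r in rets) / len(rets),
            "beat": sum(r > overall_avg for r in rets) / len(rets),
            "bust": sum(r <= bust_level for r in rets) / len(rets),
        })
    return rows

# Made-up data: 20 observations alternating a 15% loss and a 5% gain.
demo = decile_table([i / 100 for i in range(20)], [-0.15, 0.05] * 10)
print(demo[0])
```

The "stdev" column is left out to keep the sketch short; statistics.stdev on each bucket would add it.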
However, even a very statistically significant R-squared of 22% has some inherent inaccuracy, as can be seen in the chart above, and in the table below.
Run Date  Model Output  Decile of Output  Subsequent Return 
12/27/94  0.241  10  0.341 
12/26/95  0.176  8  0.213 
12/30/96  0.168  8  0.303 
12/29/97  0.108  5  0.261 
12/28/98  0.019  1  0.195 
12/27/99  0.061  3  0.101 
12/26/00  0.057  3  0.097 
12/31/01  0.031  1  0.225 
12/30/02  0.051  2  0.220 
12/29/03  0.069  3  0.093 
12/27/04  0.207  9  0.048 
12/19/05  0.126  6  ???? 
Predictions for 2006 - Part 2, Large Caps Picked to Outperform
The following disclaimer applies to all the predictions listed below.
As always, none of this is to be taken as a recommendation or solicitation to buy or sell securities. It's not advice in any form, it's just an opinion expressed by an individual, amateur investor.
In the interest of full disclosure, if I marked a stock with an asterisk (*) it means I have a beneficial interest in that stock at the time of writing. If the stock is marked with a double asterisk (**) then I have had, but do not currently have, a beneficial position in that stock at the time of writing. While I don't have a position in every stock I ever write about, one should assume that if I write about it, I may want a position in it, and my positions and opinions may change without notice (but not without reason).
In the large cap (over $5 billion) universe, I did find several that were appealing. These are, for the most part, profitable companies with high returns on assets, trading at attractive multiples from earnings, book value, and PEG, and pretty much contrarian picks in that they have not shown great returns over the last year and are currently underappreciated in the market. In alphabetical order, they are:
1. Dow Chemical (DOW)
2. Gannett Co. Inc* (GCI)
3. Gap Inc (GPS)
4. Magna International** (MGA)
5. P.P.G. Industries (PPG)
6. V.F. Corporation (VFC)
I also came across somewhat less appealing large caps, listed separately because of various departures from the above profile, i.e., either they've already had a decent run-up, or their PEG is high for my tastes, etc. I still believe that they have a significant chance to outperform the indices by the end of 2006, however.
7. Chevron* (CVX)
8. Coca Cola Enterprises (CCE, not KO)
9. ConAgra Food Inc. (CAG)
10. Huaneng Power Int. (HNP)
11. Paccar Inc. (PCAR)
12. Verizon Communications (VZ)
So there we have it, 12 large caps that I believe will outperform the broader market over the next year.
Predictions for 2006 - Part 3, Small Caps Picked to Outperform
Typically, however, I prefer small caps to large caps. Not only is there more potential upside, the lack of analyst coverage and popular interest increases the odds of finding significantly mispriced securities. Note that I don't screen based on market capitalization, but when looking for financially strong, profitable companies whose stocks are trading at a significant discount, you don't find too many large caps that fit the bill.
1. Dorel Industries* (DIIB)
2. Golden Telecom* (GLDN)
3. Kindred Healthcare* (KND)
4. Sanderson Farms* (SAFM)
I found several that were also appealing to me, but seemed a bit more risky in my eyes for various reasons. However, I still think the following small caps have great potential to outperform the indices in 2006. Yes, I'm aware that LIZ is a mid cap, but cut me some slack here!
5. Affirmative Insurance (AFFM)
6. America's CarMart (CRMT)
7. Direct General (DRCT)
8. Helen of Troy (HELE)
9. Lacrosse Footwear (BOOT)
10. Liz Claiborne (LIZ)
11. Nash Finch Co. (NAFC)
12. Rocky Shoes and Boots (RCKY)
And there we go, 11 small caps and one mid cap that I believe will, as a portfolio, outperform the broader market by the end of 2006.
Predictions for 2006 - Part 4, Large Caps Picked to Underperform
Based on the same type of modeling that I have been using in my own investing, I have assembled a list of one dozen large cap stocks that I think are poised to underperform the broader market by the end of 2006. In a similar format, I'll list alphabetically the ones I have the least confidence in first, then alphabetically the remainder of the dozen.
1. Schering Plough (SGP)
2. Serono SA ADS (SRA)
3. Shire Plc ADS (SHPGY)
IMO the first three might actually make good shorts over the long term, although they may be crowded; please note that I haven't checked their short ratios at this point.
4. Alcon Inc (ACL)
5. Chicago Merc. Exc. A (CME)
6. Cognizant Tech (CTSH)
7. Genentech Inc. (DNA)
8. Google (GOOG)
9. Infosys Techn ADS (INFY)
10. Paychex Inc. (PAYX)
11. Starbucks (SBUX)
12. Wipro Ltd. (WIT)
While I don't have confidence these will make good long-term shorts, they might be worth shorting on and off, since IMO they will either be range-bound or declining over the course of the year.
Predictions for 2006 - Part 5, Very Confident They Will Underperform
Using the same historical database I mined to devise my investing style, I have been looking for indicators of dramatic underperformance. The following are a dozen stocks that I anticipate will not only underperform the indices by the end of 2006, but will actually decline. Additionally, these all have very small short ratios as of the most recent data, making them potentially attractive shorts. These are listed in descending order of market cap.
1. Sunpower Corp (SPWR)
2. Baidu.com (BIDU)
3. Theravance (THRX)
4. China Medical Tech (CMED)
5. Silver Wheaton Co. (SLW)
6. Birch Mountain Res. Ltd. (BMD)
7. Threshold Pharm (THLD)
8. Rackable Systems (RACK) - Oddly enough, Cramer just mentioned this as a bull!
9. Biocryst Pharm (BCRX)
10. Tan Range Exploration (TRE)
11. SIFY Limited ADR (SIFY)
12. Dexcom (DXCM)
As always, remember that you are the one responsible for what you do (or don't do), and happy investing for the New Year!