Background
The identification of predictors of treatment response holds tremendous potential for the improvement of clinical outcomes in bipolar disorder (BP). The goal of this project is to evaluate the test-retest reliability of a new clinical tool, the Lithium Questionnaire (LQ), for the retrospective assessment of long-term lithium use in research participants with BP.

Methods
Twenty-nine individuals with BP-I (n=27), major depression (n=1), or schizoaffective disorder (n=1) were recruited for participation. The LQ was administered to all participants at two time-points, spaced 17 months apart on average, and used to determine each subject's score on the Retrospective Criteria of Long-Term Treatment Response in Research Subjects with Bipolar Disorder Scale (the Alda Scale). Scores were confirmed through a best-estimate procedure, and the test-retest reliability of the LQ was calculated as an intra-class correlation coefficient (ICC).

Results
The correlation between the total Alda Scale scores at the two time-points was in the moderate range (ICC=0.60). Relevant clinical factors, such as age or the presence of Axis I psychiatric comorbidity, did not influence the reliability.

Limitations
The validity of the LQ was not examined. Inclusion of two participants with non-BP diagnoses may have affected the LQ's reliability, but re-analysis of our data after exclusion of these participants did not change the reliability estimate. The absence of measures of mood and cognition at the time of LQ administration may be a limitation of this work.

Conclusions
The LQ holds promise for the standardization of the retrospective assessment of long-term treatment in BP.
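The test-retest ICC reported above can be illustrated with a minimal sketch. The abstract does not state which ICC form was used, so the code assumes a two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1), after Shrout and Fleiss). The function name `icc_2_1` and the example scores are hypothetical and are not data from the study.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is a list of rows, one per participant; each row holds that
    participant's score at every time-point (two time-points in a
    test-retest design). The choice of ICC form is an assumption.
    """
    n = len(scores)        # number of participants
    k = len(scores[0])     # number of time-points
    grand = sum(sum(row) for row in scores) / (n * k)

    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)    # between time-points
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols                     # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical total scores on a 0-10 scale at two time-points (not study data)
example_scores = [(8, 7), (3, 4), (6, 6), (9, 7), (2, 3), (5, 6)]
print(round(icc_2_1(example_scores), 2))
```

Under the absolute-agreement form, a systematic shift between the two administrations lowers the coefficient, whereas a consistency-type ICC would ignore it. The form chosen therefore affects how a value such as 0.60 should be read.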