Search for lepton-number violating B+ ->X- l+ l'+ decays

We report on a search for eleven lepton-number violating processes B+ ->X- l+ l'+ with X- = K-, pi-, rho-, K*- or D- and l+/l'+ = e+ or mu+, using a sample of 471+/-3 million BBbar events collected with the BaBar detector at the PEP-II e+e- collider at the SLAC National Accelerator Laboratory. We find no evidence for any of these modes and place 90% confidence level upper limits on their branching fractions in the range $(1.5-26)\times 10^{-7}$.

In the Standard Model (SM), lepton-number conservation holds in low-energy collisions and decays but it can be violated in high-energy or high-density interactions [1]. The observation of neutrino oscillations [2] indicates that neutrinos have mass and, if the neutrinos are of the Majorana type, the neutrino and antineutrino are the same particle and processes that involve lepton-number violation become possible [3]. Many models beyond the SM predict that lepton-number is violated, possibly at rates approaching those accessible with current data [4]. Lepton-number violation is also a necessary condition for leptogenesis as an explanation of the baryon asymmetry of the universe [5].
We report here on a search for B + → X − ℓ + ℓ ′+ with X − = K − , π − , ρ − , K * − , or D − and ℓ + /ℓ ′+ = e + or µ + . We exclude the four combinations previously measured by the BABAR collaboration [7]. We use a data sample of 471 ± 3 million BB pairs, equivalent to an integrated luminosity of 429 fb −1 [11], collected at the Υ (4S) resonance with the BABAR detector at the PEP-II asymmetric-energy e + e − collider at the SLAC National Accelerator Laboratory. The e + e − center-of-mass (CM) energy is √ s = 10.58 GeV, corresponding to the mass of the Υ (4S) resonance (on-resonance data). The BABAR detector is described in detail in Ref. [12].
Monte Carlo (MC) simulation is used to identify the background contamination, calculate selection efficiencies, evaluate systematic uncertainties, and to cross-check the selection procedure. The signal channels are simulated by the EvtGen [13] package using a three-body phase space model. We also generate light quark qq continuum events (e + e − → qq, q = u, d, s), charm e + e − → cc continuum events, e + e − → µ + µ − (γ), Bhabha elastic e + e − scattering, BB background, and two-photon events [14]. Final-state radiation is provided by Photos [15]. The detector response is simulated with GEANT4 [16], and all simulated events are reconstructed in the same manner as data.
Particle identification is applied to all charged tracks. The charged pions and kaons are identified by measurements of their energy loss in the tracking detectors, and the number of photons and the Cherenkov angle recorded by the ring-imaging Cherenkov detector. These measurements are combined with information from the electromagnetic calorimeter and the muon detector to identify electrons and muons [12].
We select events that have four or more charged tracks, at least two of which must be identified as leptons. The ratio of the second-to-zeroth Fox-Wolfram moments [17] of the event must be less than 0.5 and the two charged leptons must have the same sign and momenta greater than 0.3 GeV/c in the laboratory frame. The separation along the beam axis between the two leptons at their closest approach to the beamline is required to be less than 0.2 cm. The combined momentum of the ℓ + ℓ ′+ pair in the CM system must be less than 2.5 GeV/c. Electrons and positrons from photon conversions are removed, where photon conversion is indicated by electron-positron pairs with an invariant mass less than 0.03 GeV/c 2 and a production vertex more than 2 cm from the beam axis.
The K * − is reconstructed through its decay to K 0 S π − and K − π 0 ; the ρ − and D − are reconstructed through their decays to π − π 0 and K + π − π − , respectively. The photons from the π 0 must have an energy greater than 0.03 GeV, and the π 0 is required to have an energy greater than 0.2 GeV, both measured in the laboratory frame. The reconstructed π 0 invariant mass must be between 0.12 and 0.16 GeV/c 2 . The invariant mass of the ρ − is required to be between 0.470 and 1.07 GeV/c 2 . The K 0 S must have an opening angle θ between its flight direction (defined as the vector between the B meson and K 0 S vertices) and its momentum vector such that cos θ > 0.999, a transverse flight distance greater than 0.2 cm, a lifetime significance τ /σ τ > 10, and a reconstructed invariant mass between 0.488 and 0.508 GeV/c 2 . We require the D − invariant mass to be between 1.835 and 1.895 GeV/c 2 . The invariant mass ranges are chosen to be wide enough to allow the background event distributions to be modeled.
The two leptons are combined with either a resonance candidate or a charged track to form a B-meson candidate. For muon modes, the invariant mass of each combination of a muon and a charged track from the B-meson candidate must be outside the region 3.05 < m ℓ + h − < 3.13 GeV/c 2 . This rejects events where a muon from a J/ψ decay is misidentified as a pion. The probability to misidentify a pion as a muon is approximately 2% and to misidentify as an electron is less than 0.1%.
We measure the kinematic variables m ES = (s/2 is the four-momentum of the CM system and p B is the B candidate momentum vector, both measured in the laboratory frame, √ s is the total CM energy, and E * B is the energy of the B candidate in the CM sys-tem. For signal events, the m ES distribution peaks at the B meson mass with a resolution of about 2.5 MeV/c 2 , and the ∆E distribution peaks near zero with a resolution of about 20 MeV. The B candidate is required to be in the kinematic region 5.240 < m ES < 5.289 GeV/c 2 . The ∆E range depends on the mode but always satisfies |∆E| < 0.3 GeV. The backgrounds arising from qq, cc, and BB events are suppressed through the use of a boosted decision tree discriminant (BDT) [18]. The BDT has nine inputs: the ratio of the second-and zeroth-order Fox-Wolfram moments based on the magnitude of the momentum of all neutral clusters and charged tracks in the rest-of-theevent (ROE) not associated with the B candidate; the absolute value of the cosine of the angle between the B momentum and the beam axis in the CM frame; the absolute value of the cosine of the angle between the B thrust axis [19] and the beam axis in the CM frame; the absolute value of the cosine of the angle between the B thrust axis and the thrust axis of the ROE in the CM frame; the output of the flavor tagging algorithm [20]; the separation along the beam axis between the two leptons at their points of closest approach to the beamline; the missing energy in the CM; the momentum of the lepton pair in the CM; and the boost-corrected proper-time difference between the decays of the two B mesons divided by its variance. The second B meson is formed by creating a vertex from the remaining tracks that are consistent with originating from the interaction region. The discriminant is trained for each signal mode using on-resonance data with m ES < 5.27 GeV/c 2 together with samples of simulated signal and background events. We compare the distributions of the data and the simulated background variables used as input to the BDTs and confirm that they are consistent.
After the application of all selection criteria, some events will contain more than one reconstructed B candidate. From the simulation sample, we estimate that the fraction of signal events with a π 0 that have more than one candidate is 30%, 13% for signal events with a K 0 S , and less than 6% for signal events where the lepton pair is combined with a D − , K − , or π − . We select the most probable B candidate from among all the candidates in the event using the χ 2 from the B candidate vertex fit. Averaged over all simulated signal events, the correct B candidate is selected with an accuracy of more than 83% for the signal events with a π 0 , greater than 96% for signal events with a K 0 S , and over 99% for signal events where the lepton pair is combined with a D − , K − , or π − . The final event selection efficiency for simulated signal is between 6% and 16%, depending on the final state. Figure 1 shows the variation of the selection efficiency as a function of the reconstructed invariant mass m X − ℓ + , calculated for both leptons, for four of the modes. The remaining modes have similar distributions.
For each mode, we extract the signal and background yields from the data with an unbinned maximum likelihood (ML) fit using where the likelihood for each event candidate i is the sum of n j P j ( x i ; α j ) over two categories j: the signal mode B + → X − ℓ + ℓ ′+ (including the small number of misreconstructed signal candidates) and background, as will be discussed. For each category j, P j ( x i ; α j ) is the product of the probability density functions (PDFs) evaluated for the i-th event's measured variables x i . The number of events for category j is denoted by n j and N is the total number of events in the sample. The quantities α j represent the parameters describing the expected distributions of the measured variables for each category j. Each discriminating variable x i in the likelihood function is modeled with a PDF, where the parameters α j are extracted from MC simulation or on-resonance data with m ES < 5.27 GeV/c 2 . The variables x i used in the fit are m ES , ∆E, and the multivariate discriminant BDT output; for modes involving a resonance, the resonance invariant mass is included as a fourth variable. The linear correlations between the four variables are found to be typically 4%-9% for simulated signal modes. Only B + → D − e + e + shows a larger correlation, between the invariant mass and ∆E, due to the occasional Bremstrahlung energy loss from the electrons. We take each P j to be the product of the PDFs for the separate variables and treat any correlations in the variables later as a source of systematic uncertainty. MC simulations show that the qq, cc, and BB backgrounds have similar distributions in the four variables af-ter the selection criteria have been applied and we therefore use a single background parameterization. An AR-GUS parameterization [21] is used to describe the m ES distribution. For ∆E, a first-or second-order polynomial is used or, for modes with a D meson, a Cruijff function [22]. The multivariate discriminant BDT output is fitted using a non-parametric kernel estimation KEYS algorithm [23]. The mass distributions for modes with a resonance are fitted with a first-order polynomial, together with a Gaussian function if the resonance is present in the backgrounds.
For the signal, the m ES distribution is parameterized with a Crystal Ball function [24]. A Crystal Ball function is also used for ∆E, together with a first-order polynomial for modes with a π 0 . The multivariate discriminant BDT output is taken directly from the MC distribution using a histogram. For the resonances, the signal masses are parameterized with two Gaussians, a relativistic Breit-Wigner function, and a Gounaris-Sakurai function [25] for the D − , K * − , and ρ − , respectively. The free parameters in the ML fit are the signal and background event yields, the slope of the background m ES distribution, and the polynomial parameters of the background ∆E and mass distributions.
We test the performance of the fits to B + → X − ℓ + ℓ ′+ by generating ensembles of MC datasets from both the PDF distributions and the fully simulated MC events; in the latter case, the correlations between the variables are correctly simulated. We generate and fit 10,000 datasets with the numbers of signal and background events allowed to fluctuate according to a Poisson distribution. The signal yield bias in the ensemble of fits is between −0.3 and 1.2 events, depending on the mode, and this is subtracted from the yield obtained from the data.
As a cross-check of the background PDFs, we perform a fit to a simulated background sample, with the same number of events as the on-resonance data sample. The number of fitted signal events is compatible with zero for all modes. We also perform a blinded fit to the onresonance data for each mode and confirm that the distributions of the background events are reproduced by the background PDFs. Events are identified as background if P bck /(P bck + P sig ) > 0.9, where P sig and P bck are computed for each event for the signal and background, respectively, without the use of the variable under consideration.
BABAR has previously published, using a different selection technique, four measurements of LNV in B + → h − ℓ + ℓ + where h − = K − or π − , and ℓ + ℓ + = e + e + or µ + µ + [7]. To validate the analysis reported here and as a cross-check only, we repeat the previous measurements, using the selection criteria described in this article. The reconstruction efficiencies are lower using this current analysis and the measured 90% C.L. branching fraction upper limits are less stringent compared to the previous results by between 3% and 80%, depending on the mode. This is compatible with the use here of a generic selection procedure for the eleven reported modes.
The results of the ML fits to the on-resonance data are summarized in Table I. The signal significance is defined as S = √ 2∆ ln L, where ∆ ln L is the change in loglikelihood from the maximum value to the value when the number of signal events is set to zero. Systematic errors are included in the ln L distribution by convolving the likelihood function with a Gaussian distribution with a standard deviation equal to the total systematic uncertainty, defined later in this article. If the log-likelihood corresponds to a negative signal, we assign a significance of zero. The branching fraction B is given by n s /(ηN BB ), where n s is the signal yield, corrected for the fit bias, η is the reconstruction efficiency, and N BB is the number of BB pairs collected. We assume equal production rates of B + B − and B 0 B 0 mesons. Figures 2 and 3 show the projections of the fit onto the discriminating variables for two of the modes, B + → π − e + µ + and B + → K * − µ + µ + (K * − → K 0 S π − ). The candidates in the figure are subject to the requirement on the probability ratio P sig /(P bck + P sig ) > 0.9, where P sig and P bck are computed without the use of the variable plotted. The other modes show similar distributions.
The systematic uncertainties in the branching fractions arise from the PDF parameterization, fit biases, background yields, and efficiencies. The PDF uncertainties are calculated by varying, within their errors, the PDF parameters that are held fixed in the default fit, taking into account correlations. For the KEYS algorithm, we vary the smearing parameter between 50% and 150% of the nominal value [23], and for the histograms we change the number of bins used. The uncertainty for the fit bias includes the statistical uncertainty in the mean difference between the fitted signal yield from the ensemble of 10,000 MC datasets described above and the signal yield from the fit to the default MC sample, and half of the correction itself, added in quadrature.
To calculate the contribution to the uncertainty caused by the assumption that the qq, cc, and BB backgrounds have similar distributions, we first vary the relative proportions of qq, cc, and BB used in the simulated background between 0% and 100% and retrain the BDT function for each variation. The new simulated background BDT PDF is then used in the fit to the data and the fitted yields compared to the default fit to data. The uncertainty is taken to be half the difference between the default fit and the maximum deviation seen in the ensemble of fits. All the uncertainties described previously are additive in nature and affect the significance of the branching fraction results. The total additive signal yield uncertainty is between 0.2 and 0.7 events, depending on the mode.
The sources of multiplicative uncertainties include: reconstruction efficiency from tracking (0.8% per track for the leptons and 0.7% for the kaon or pion, added linearly); neutral π 0 and K 0 S reconstruction efficiency (3.0% and 1.0%, respectively); charged particle identification (0.7% for electrons, 1.0% for muons, 0.2% for pions, 1.1% for kaons, added linearly); the BDT response from comparison to charmonium control samples such as B − → J/ψ X − (2.0%); and the number of BB pairs (0.6%) [12]. The total multiplicative branching fraction uncertainty is 5% or less for all modes.
When forming the overall branching fraction for the B + → K * − ℓ + ℓ ′+ decays, we assume that the overall K * − sub-mode additive uncertainties are uncorrelated and the multiplicative uncertainties are correlated.
As shown in Table I, we observe no significant yields. We use a Bayesian approach to calculate the 90% C.L. I: Summary of results for the measured B decay modes: total number of events in analysis region, signal yield ns (corrected for fit bias) and its statistical uncertainty, reconstruction efficiency η, daughter branching fraction product ΠBi(%), significance S (systematic uncertainties included), measured branching fraction B, and the 90% C.L. upper limit (BUL).

Mode
Events 117 −1.9 ± 4.7 7.9 ± 0.1 33.3 0.0 −1.5 ± 3.8 ± 0.4 6.5 branching fraction upper limits B UL by multiplying the likelihood distributions with a prior which is null in the unphysical regions (n s < 0) and constant elsewhere. The total likelihood distribution is integrated (taking into account statistical and systematic uncertainties) as a function of the branching fraction from 0 to B UL , such that BUL 0 L dB = 0.9 × ∞ 0 L dB. For the overall K * − ℓ + ℓ ′+ results, the total likelihood distributions for the two submodes are first combined before integration. The upper limits in all cases are dominated by the statistical uncertainty.
In summary, we have searched for eleven leptonnumber violating processes B + → X − ℓ + ℓ ′+ . We find no significant yields and place 90% C.L. upper limits on the branching fractions in the range (1.5 − 26) × 10 −7 . The limits for the modes with a ρ − , π − or K − are an order of magnitude more stringent than previous best measurements [10]. The limits for the B + → D − ℓ + ℓ ′+ are compatible with those reported in Ref. [8].
We are grateful for the excellent luminosity and machine conditions provided by our PEP-II colleagues, and for the substantial dedicated effort from the comput-ing organizations that support BABAR. The collaborating institutions wish to thank SLAC for its support and kind hospitality. This work is supported by DOE    b) ∆E; c) BDT output; and d) mass for the mode B + → K * − µ + µ + (K * − → K 0 S π − ). The points with error bars show the data; the dashed line is the background PDF; the solid line is the signal-plus-background PDF; and the solid area is the signal PDF.